ANTHROPOLOGICAL OBSERVATIONS ON THE ANGLO-INDIANS
OF CALCUTTA.


Part I. ANALYSIS OF MALE STATURE. 
By PRASANTA CHANDRA MAHALANOBIS, B.Sc., B.A. (Cantab.),
Indian Educational Service, Professor of Physics, Presidency
College, Calcutta.

(Plates I—IV.) 

INTRODUCTORY NOTE.

The people with whom these papers will deal are those officially
called “Anglo-Indians” in India. They are not, however,
the Anglo-Indian of English literature and common parlance,
in which the term is applied to persons of English, or rather
British, birth who have spent a considerable part of their lives in
India. Some years ago the Government of India, seeking to
avoid the associations that had grown up round the name Eurasian,
decided that persons of mixed Indian and European blood
should be known henceforth as Anglo-Indians.¹ The word Eurasian
had itself been invented to avoid a coarser and more descriptive
term. That even the more recent designation was inaccurate
in point of fact was pointed out at the time of its introduction
in a letter published in a Calcutta newspaper and signed
“Franco-Burman.” The term Indian, indeed, had been stretched
to include all native denizens of the Indian Empire (Burmese,
Baluchis, etc., as well as Indians properly so called), while it had
been forgotten that any other European nation but the English
had ever had a part in India.

The observations on which Professor Mahalanobis’ analyses
are based had their origin as follows. Ever since I began to take
a serious interest in anthropometry, I have had doubts as to the
value of bodily measurements taken on the living person. So
long ago as 1903,² I pointed out that my own measurements of the
faces of the people of the Faroe Islands were completely at variance
with those of a previous observer, and attributed the
different results mainly to slight difference in technique. The
working out of the measurements of the various tribes of the
Malay Peninsula obtained in 1901-1902³ by Mr. H. C. Robinson
and myself increased my doubts, and further made me suspicious


¹ I understand, however, that as early as 1830 the term Anglo-Indians had
already been applied to persons of mixed descent.
² Annandale, Proc. Roy. Soc. Edinburgh XXV, pp. 2-24 (1903).
³ Annandale and Robinson, Fasciculi Malayenses, Anthropology (1903-1904).
that there was some inherent fallacy in the whole method. These
measurements were taken with special care, each individual being
measured three times over and most by two observers. Although
they showed the gross differences in head-measurements between 
the civilized and the uncivilized tribes, they failed completely to 
demonstrate differences between the heads of the Negrito and of 
the Indonesian jungle tribes. 

Having in 1916 an opportunity of examining a number of 
Anglo-Indians anthropometrically, I determined to see whether 
my doubts were further justified by the investigation of a race 
known to be of recent mixed origin. Before discussing the methods 
adopted, I must say a few words about my subjects. They were,
with very few exceptions, young men between the ages of 18 and
40, and with few exceptions belonged to what I may call the 
middle class of so-called Anglo-Indians, mostly employed as clerks, 
mechanical engineers, overseers and so forth, or else fresh from 
school and about to take up employment of the kind. The fact 
is of importance, for social distinctions are somewhat rigidly 
maintained in this community. I am indebted to Mr. H. A. 
Stark, late Principal, Dacca Training College, now Principal, 
Armenian College, Calcutta, for valuable information on the point. 
Among the Anglo-Indian community of Calcutta some families 
claim descent from Mahommedan ladies of noble and even prince- 
ly birth, who in the old days entered into alliances of a perfectly 
regular kind from a Mahommedan point of view with Englishmen 
of good birth. These families are, however, comparatively few. 
At the other end of the social scale are the “Kintalis”,¹ whose
origin is thus described by Mr. Stark in a lecture on “Calcutta in
Slavery Days” read before the Calcutta Social Study Society on
March 13th, 1916.

“The liberated slaves [who, as Mr. Stark had previously
explained, were mainly Indians but included not a few Negroes]
unbeknown to themselves that they had been doing what the
manumitted Roman slaves had done centuries before, in gratitude
assumed the surnames of their late masters. Their descendants,
for the most part, survive in the ‘Kintal’ population of the
city.”

If this were a full statement of the case, it might be doubted 
whether the Kintalis have any real claim to be of mixed race, 
unless there is some slight admixture of Negro blood; but, as in 
all cities, there is a tendency for certain individuals of the more 
respectable classes to sink down to the slums and become a part 
of the submerged population, which is represented in Calcutta, 
so far as the Christian communities are concerned, by the Kintalis. 

Be this as it may, few or no Kintalis are among the persons 
I measured, and probably none of very old family. So far as 
possible, moreover, we have eliminated from the measurements 


¹ The name is derived from the lodging-houses (Kintal) in which many of
these people live or lived. The word Kintal, however, now means little more 
than a slum inhabited by low-class Christians. 
analysed those of persons known to have recent Negro or Mongoloid
blood, i.e. persons one of whose parents or grandparents
was a Negro or belonged to a Mongoloid stock. This has been a 
necessary precaution, because the number of individuals in which 
the further complexity was introduced was large enough to affect 
the results without being sufficiently numerous to afford a sound 
basis for mathematical treatment. So far as recent Negro blood 
was concerned I was fairly confident in accepting the statements 
of those who offered themselves for measurement, as certain, not 
by any means all, Negro traits were present. I refer particularly 
to woolly hair, dark complexion, negroid nose and prognathism. 
The long lower limb and slender shin of the Negro, which is of a 
different type from that of the Indian, were not perpetuated in a 
single individual.¹ As to old Negro blood, no definite information
was obtained.

To eliminate the recent Mongoloid element from our inves- 
tigations was, however, a much less easy task and I am by no 
means sure that this has been done successfully. Here again I 
had to trust to the statements of individuals measured, but 
Mongoloid traits are often reproduced in a much more subtle 
manner than Negroid, and the Mongoloid element in the popula- 
tion of Calcutta is much larger than the Negroid. Indeed, I 
have observed that many of the most intelligent Anglo-Indians 
with whom I have had dealings have had distinctly Mongoloid 
features. This is not surprising, for the offspring of women of 
the various Mongoloid tribes of the Himalayas, Assam and 
Burma, who are not generally averse to unions of a more or less 
permanent nature with educated Europeans settled in their dis- 
tricts, are not only of respectable parentage in both lines but 
often receive a good education, and Calcutta is the natural goal of 
such people. So far as I could discover, it is unusual for an 
Anglo-Indian to know much of his family for more than two or 
three generations back and at the present time, in Calcutta at 
any rate, most of the community are the result of marriages of 
persons of mixed blood.²

The subjects of my investigations were, therefore, mainly of 
mixed Indo-European blood, probably in many individuals with 
some Mongoloid admixture, but not affiliated with the higher 
Hindu castes. 

The measurements were taken in the zoological laboratory of 
the Indian Museum in the years 1916-1919. I had the help of


¹ As only about half a dozen Anglo-Indian-Negroes were examined, I have
refrained from giving details and merely cite the results for what they are worth. 
Recent Negro settlers in Calcutta are mostly West Indians. They and their 
families occupy a street practically by themselves. 

2 I may here note that further complexity is now being introduced into the 
Anglo-Indian community by the marriage of Anglo-Indian women to Canton 
Chinese, who are now numerous as cabinet-makers and bootmakers in Calcutta. 
These men keep themselves entirely apart from the Indian communities and 
frequently marry Anglo-Indians, though the custom of bringing their wives from 
China is becoming much more common than it was a few years ago.
several assistants, among whom I may mention in particular my
late laboratory assistant Mr. J. Caunter, to whom I was indebted 
for obtaining many of my subjects. Dr. F. H. Gravely and 
Dr. K. S. Roy devoted much time and labour to helping me. 

The investigations were conducted in a less systematic manner
than I would have wished, partly because they were in themselves 
of the nature of an experiment and I was perpetually attempting 
to discover more satisfactory methods, and partly because they 
had to be carried out at odd times, chiefly on Sundays and holi- 
days, when subjects were available. The measurements that 
have been utilised by Prof. Mahalanobis were, however, made on 
one system and with the same instruments. The system was that
recommended in the British Association’s hand-book on anthropology
and the instruments were the “Anthropometer” (112) and
“Instrumentantascher” (203) supplied by Hermann of Zurich.

Prof. Mahalanobis has, in my opinion wisely, decided to
treat the measurements as accurate only within 2 mm. He notes
a tendency on my part to favour even numbers. Of this I was
barely conscious at the time, but on attempting to reconstruct
the process in my mind I seem to recollect that when I was not
quite sure of a measurement within a millimetre, I had a prejudice
in favour of even numbers. I never thought it possible to
measure to within less than a millimetre. It is curious, however,
that this prejudice seems to have communicated itself to my
assistants, by several of whom the measurements were occasionally
taken while I noted them down. That it has done so is evidence
at any rate of uniformity of method. 

The measurements, discussed without knowledge of mathe- 
matics, seemed to me so unsatisfactory that I had practically 
decided to reject them altogether, until I was so fortunate as to 
get into touch with Prof. Mahalanobis at the Nagpur meeting of 
the Indian Science Congress and he offered to analyse them 
statistically. The results he has already obtained seem to justify 
their publication, and to emphasize the value of co-operation and 
co-ordination of different branches of scientific work in anthro- 
pology, without which, in my opinion, further progress in most 
branches of biology has become impossible. 

The special importance of investigations conducted on the 
Anglo-Indians lies in the fact that although we may not be able 
to trace out the history of any one family, we know that the 
whole race, if such it may be called, has arisen practically within 
the last 200 years by the admixture of other pre-existing races. 
After Prof. Mahalanobis has discussed my measurements on
mathematical lines, I hope to have an opportunity of considering
other aspects of the somatology of this interesting community.
We hope thus to throw some light on the question of the origin of
human races by fusion.

N. ANNANDALE,
Director, Zoological Survey of India,
Calcutta.
SECTION I. GENERAL REMARKS.


In the present paper I have attempted a statistical exami- 
nation of Anglo-Indian Stature based on Dr. Annandale’s records. 
The measurements were all taken by Dr. Annandale or in a few 
cases under his direct supervision. Thus the present material may
be considered free from large fluctuating errors due to different 
personal bias of different observers. 


NATURE OF THE MATERIAL.


Dr. Annandale has explained in his introductory note the 
special character of the present material. After excluding 
“Negro,” “West Indies,” “Chinese,” “Burmese” and “Bhutia”
ancestry and omitting certain incomplete and doubtful records, a
series of 200 was obtained for Stature, Head Length, Head Breadth,
Nasal Length, Nasal Breadth, Zygomatic Breadth and Upper
Face Length.¹

The great importance of the present material from a biometri- 
cal standpoint will be easily appreciated. So far as I am aware 
this is the first time that a true biologically mixed population is 
being studied by statistical methods. 

From the statistical standpoint the coefficient of variability is
considered to be a very important test of homogeneity.² Hitherto
all attempts to fix the upper limit of homogeneous variability were
necessarily confined to the study of artificially made up mixtures.³
The Anglo-Indian data furnish us with a “natural mixture.” A
careful study may be expected to throw considerable light on this
vexed question. Incidentally, it will be of great interest to compare
the variability of such a “mixed” population with those of
“purer” races.⁴

The Anglo-Indian population may really represent a new
“race” in the making, and we hope to discuss in the sequel what
indications may be afforded by a study of the present material as
regards the mechanism of race formation.

It should be noted, however, that the word “race” is here used
in its statistical sense. Pearson⁵ says, “Any race may originally
have arisen from a mixture of races, but such a mixed race is
wholly different from a mixture of races, which have not interbred.”


¹ Arithmetical work on these characters is nearly finished and I hope to
publish the results at an early date.

² This is true of course for uni-modal data only, or more generally for
distributions which cannot be dissected into component frequency groups. For a
fuller discussion of this point see pp. 34, 93-94.

³ C. S. Myers, Man, February, 1903, pp. 28-32. Also see Karl Pearson’s
discussions on this point in Biometrika Vol. 2, 1903, pp. 345-347, Myers’ Reply
and Pearson’s Remarks on this Reply in Biometrika Vol. 2, 1903, pp. 504-508.

⁴ “Purer” in a statistical sense, i.e. more homogeneous.
⁵ Biometrika Vol. 2, 1903, p. 506.
The special significance of the present material is that it does
represent a mixed race which has interbred and whose component
races are still in a pure form.


PLAN AND SCOPE OF THE PAPER. 


Dr. Annandale took a very large number of measurements 
extending to forty different characters. But the records are not 
complete in each case. As I have already mentioned, a series of
200 has been obtained for seven¹ metric characters. A second
group² consists of about 120 to 180 and a third³ of 50 to 100
complete records. In addition, eye and skin colour were recorded,
as also observations on hairiness, in all cases.

In the present paper the frequency distribution and variability
of stature have been discussed at some length. Certain points
have been considered in great detail, much of which it will not be 
necessary to repeat in subsequent parts. 

The second part (material for which is nearly ready) will contain
a study of the frequency distribution and variability of individual
organs included in the first group. Correlation between the organs
of the first group will be next discussed and after that the
study of the second and the third group will be taken up. Finally 
I hope to discuss the distribution and correlation of eye, hair and 
skin-colour in a separate paper. 

I should make my position quite clear; I frankly confess 
that I know very little of anatomy. My work on the data supplied 
has been purely statistical. 

Some of the results may appear to be thoroughly unconvention- 
al or sometimes perhaps even startling in character. With such a 
short series, it is of course impossible to lay emphasis on the 
numerical value of any particular constant. But I would like 
to draw the attention of Anthropologists to statistically signi- 
ficant magnitudes as not unworthy of careful study. I have 
contented myself with pointing out statistical results and have 
refrained from drawing Anthropological conclusions. 

The chief object of the present study is to invite the attention
of Physical Anthropologists of India to the importance of the
application of accurate statistical methods to their “crude”
measurements. As some of the technical terms may be unfamiliar


¹ Stature, Head Length, Head Breadth, Nasal Length, Nasal Breadth,
Zygomatic Breadth, Upper Face Length.

² (i) Gonial breadth 181. (ii) Frontal breadth 142. (iii) Shoulder breadth
171. (iv) Thigh breadth 171. (v) Height of knee-joint, inside 174. (vi) Height
of knee-joint, outside 120. (vii) Height of middle finger 132. (viii) Styloid
height 167. (ix) Trochanter height 180. (x) Iliac height 175. (xi) Upper
radius height 118. (xii) Suprasternal height 119. (xiii) Acromion height 181.
(xiv) Leg length 174. (xv) Chest, extended 137.

³ (i) Total face length 93. (ii) External orbital breadth 93. (iii) Ocular
breadth 91. (iv) Distance between eyes 87. (v) Chest, depressed 88. (vi)
Kneeling height 87. (vii) Sitting height 93. (viii) Earhole height 87. (ix)
Span of arms 93. (x) Cubit 87. (xi) Hand length 76. (xii) Humerus length 48.
(xiii) Radius length 48. (xiv) Foot length 78. (xv) Foot breadth 78.
to Anthropologists, I have thought it advisable to include short
explanatory notes, which would have been unnecessary in a purely 
Biometrical paper. 

I must also offer my apologies to the trained statistician. 
Much of the work will no doubt appear to him to be quite superflu- 
ous. I would remind him that one of our objects has been to 
persuade the Anthropologists to adopt statistical methods. This 
has necessitated detailed consideration of certain points which 
may appear obvious to a trained statistician. 

For example, a very full discussion of the effect of grouping 
has been given. All frequency constants were calculated several 
times over with very different units of grouping. It is then shown 
that the effect of grouping is quite negligible within very wide 
limits, a result which is of course quite familiar to all statisticians.¹
But as I found very wide-spread popular misapprehension regarding
this point I have considered it desirable to give an actual
empirical demonstration of the above fact. The discussion of
various “corrections” for grouping will have its own interest to the
statistician.
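
The kind of empirical check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample is synthetic (its centre of 1650 mm and spread of 60 mm are arbitrary and are not taken from the Anglo-Indian records), the helper names are hypothetical, and Sheppard's correction is used as one familiar example of a "correction" for grouping, not necessarily the one adopted in the paper.

```python
import math
import random


def mean_and_sd(values):
    """Mean and standard deviation (divisor n) of ungrouped values."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, sd


def grouped_mean_and_sd(values, width):
    """Replace every value by the mid-point of its class, then take mean and SD."""
    mids = [(math.floor(v / width) + 0.5) * width for v in values]
    return mean_and_sd(mids)


def sheppard_corrected_sd(grouped_sd, width):
    """Sheppard's correction: subtract width**2 / 12 from the grouped variance."""
    return math.sqrt(max(grouped_sd ** 2 - width ** 2 / 12.0, 0.0))


random.seed(1)
# Synthetic statures in mm; centre and spread are arbitrary assumptions.
sample = [random.gauss(1650.0, 60.0) for _ in range(200)]
raw_mean, raw_sd = mean_and_sd(sample)

for width in (10, 20, 40):
    g_mean, g_sd = grouped_mean_and_sd(sample, width)
    corrected = sheppard_corrected_sd(g_sd, width)
    # Print the shift in mean, in SD, and in SD after Sheppard's correction.
    print(width, round(g_mean - raw_mean, 2), round(g_sd - raw_sd, 2),
          round(corrected - raw_sd, 2))
```

Whatever the sample, neither the mean nor the standard deviation can move by more than half the class width under this kind of grouping; in practice the shifts are usually far smaller than that bound.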

Another consideration has guided me in this introductory 
paper. Any extension of a scientific method to new material 
requires caution. Our Anglo-Indian data cannot be assumed to be 
homogeneous in character, hence I have thought it desirable to 
justify empirically the application of statistical methods to such 
mixed data as the present material. The assumption of “normality”
(i.e. of approximately Gaussian distribution) thoroughly
permeates many important statistical methods. It was therefore
necessary to investigate the question of frequency distribution in 
great detail. 

The arithmetical labour has been very great, especially as I
did not have any modern calculating machine to help me. This
want of mechanical accuracy may have introduced some uncertainty
in the arithmetical results and this is why I have quoted
the arithmetic very fully in order to facilitate checking by others.
In the case of important “moments,” I have checked them
absolutely by working with different start points (i.e. different
base numbers).
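
The check by different base numbers rests on the fact that moments about the mean do not depend on the working origin. A minimal Python sketch follows; the readings are hypothetical values in millimetres, not measurements from the paper, and the function name is illustrative.

```python
import math


def moments_about_base(values, base):
    """Mean and 2nd-4th central moments, computed via raw moments about `base`."""
    n = len(values)
    d = [v - base for v in values]
    nu1 = sum(d) / n
    nu2 = sum(x ** 2 for x in d) / n
    nu3 = sum(x ** 3 for x in d) / n
    nu4 = sum(x ** 4 for x in d) / n
    # Standard transfer formulae from moments about `base` to moments about the mean.
    mean = base + nu1
    mu2 = nu2 - nu1 ** 2
    mu3 = nu3 - 3 * nu1 * nu2 + 2 * nu1 ** 3
    mu4 = nu4 - 4 * nu1 * nu3 + 6 * nu1 ** 2 * nu2 - 3 * nu1 ** 4
    return mean, mu2, mu3, mu4


# Hypothetical stature readings in mm, for illustration only.
readings = [1584, 1612, 1650, 1650, 1688, 1702, 1730]

from_1600 = moments_about_base(readings, 1600)
from_1700 = moments_about_base(readings, 1700)

# The two sets of constants should agree apart from rounding.
agree = all(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-6)
            for a, b in zip(from_1600, from_1700))
print(agree)
```

Any disagreement between the two runs beyond rounding would point to an arithmetical slip in one of them, which is the purpose of the check.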

This is my first venture into the province of Biometry and 
it is not unlikely that I have made mistakes. I have included 
full details of the statistical work in the hope that competent 
Biometricians will kindly help me by pointing out errors. I 
have retained six places of decimals in the arithmetic, not in the
vain hope of reaching an impossible degree of accuracy, but for
convenience of checking. It is difficult to attain agreement to the
second place in the final results unless about six figures are
retained in the intermediate calculations in this type of work.


¹ K. Pearson, “Errors of Judgment, &c.,” Phil. Trans. Roy. Soc. Vol. 198A
(1902); “Assortative Mating in Man,” Biometrika Vol. 2, 1903, p. 485. The
authors note that “the system of grouping adopted is within wide limits
immaterial.”
I have intentionally made the present analysis very elaborate.
A total of only 200 observations did not perhaps merit such close 
scrutiny. As there was no early prospect of increasing this total 
considerably, I thought it better to complete even a provisional 
investigation thoroughly rather than wait indefinitely for a larger 
sample. But the chief reason which prompted me to make an
intensive study of the small available amount of material is this:
so far as I am aware no work in this line has been done in India;
no Anthropologist in India has ever made any use of the modern
statistical calculus associated specially with the name of Karl 
Pearson and the Biometric School. The present study is intended 
to illustrate the urgent necessity of the application of statistical 
methods to Anthropology. The conclusions based on only 200 
observations cannot of course claim any degree of finality. But 
these serve to show the kind of results which can be reached 
by statistical methods and also show the great scope and huge 
possibilities of statistical methods. 


REMARKS ON THE APPLICATION OF STATISTICAL METHODS.


Before proceeding to the more systematic part of the work I 
wish to make a few general observations on the application of 
statistical methods. I cannot do better than begin by quoting
some remarks of Charles Goring in this connection.¹

“Statistical enquiry, all scientific enquiry, is observational in
character: that is to say, it is based upon the observation of
individual facts. But these facts, in themselves, do not constitute
knowledge. Knowledge consists in the discovery of relationships
revealed by the systematic study, and by the legitimatised
weighing of facts.”

“No series of biological or social observations constitutes 
knowledge in itself. Knowledge lies potential in the facts, but 
ineffectual for use until their associations with each other have 
been accurately weighed. It is the weighing of observations 
which demands for the present enquiry, the employment of statistical
methods: such methods being merely a regulated mechanism
by which the relation between certain order of facts can be precisely
determined.”

“There is not, as is sometimes imagined, any special theory
or hypothesis involved in conclusions revealed by statistics. The
science of statistics provides only for the systematised study and
legitimatised interpretation of observed facts: such interpretation
consisting mainly in one and the same process, the associating or
dissociating one set of facts with and from another. Before any
association can be legitimately postulated, certain conditions must
be fulfilled; evidence must be produced to show that the relation,
affirmed to exist, is not a chance or accidental, but a natural
association; that it is not one resulting from coincidence, but that it
represents an inseparable connection between natural phenomena.”

“The attributes and conditions of living things are so widely
variable, are so delicately graduated in different individuals that
their correlation can seldom be legitimately postulated, and can
never be precisely estimated, without aid from a correlation
calculus: that is to say, social science almost entirely, and biological
and medical sciences to great extent, can only be built up
after preliminary mathematical analysis of large series of carefully
collected data.” This is the reason why we assert that statistical
methods are indispensable for our present enquiry.

¹ Charles Goring, The English Convict, pp. 19-20 (H.M.S.O. 1913).

We have Anthropometric measurements of 200 Anglo-Indians
as our material in the present case. We know that this
constitutes only a very small sample of the whole Anglo-Indian
population. We wish to investigate the Anthropometric characteristics
of the whole population but we are constrained to do so
from a study of the sample alone. If the sample exhibits certain
typical features, we shall be justified in inferring the presence of
these typical features in the general population. Thus our first 
statistical task is to find out the typical features of our sample. 
In order to do so, it is necessary to describe the given sample by 
means of a suitable typical curve, that is, to graduate the given 
sample suitably. 

This very process of graduation itself will “smooth out” the
irregularities peculiar to the particular sample considered. Hence 
when a typical formula is once obtained we get rid of the special 
individual peculiarities of the given sample and can replace the 
given sample by our graduated curve in all subsequent discussions. 
This graduated curve is, by logical induction, assumed to be typical 
of the whole population. 

This typical frequency curve is defined by certain statistical
constants¹ calculated from the measurements actually given in the
sample. The reliability of each constant is determined by the 
internal consistency or uniformity of the particular set of measure- 
ments from which it is derived (and the total number of measure- 
ments). The reliability (measured by the probable error) can be 
precisely calculated with the help of the statistical calculus based 
on the theory of probabilities. 
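
As a rough illustration of how such constants and their probable errors are obtained, the Python sketch below uses the usual large-sample formulae for a normally distributed character: the probable error is 0.67449 times the standard error, so that for the mean it is 0.67449 σ/√n and for the standard deviation 0.67449 σ/√(2n). The function name and the choice of divisor n for the variance are assumptions made for the example, and the sample passed in below is a toy one, not the stature data.

```python
import math

# Ratio of probable error to standard error for a normal distribution.
PE_FACTOR = 0.67449


def frequency_constants(values):
    """Mean, SD and coefficient of variability, each with its probable error."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # divisor n assumed
    cv = 100.0 * sd / mean  # coefficient of variability, per cent
    return {
        "n": n,
        "mean": mean,
        "pe_mean": PE_FACTOR * sd / math.sqrt(n),
        "sd": sd,
        "pe_sd": PE_FACTOR * sd / math.sqrt(2 * n),
        "cv": cv,
    }


# Toy values, purely to show the call; not measurements from the paper.
constants = frequency_constants([1.0, 2.0, 3.0, 4.0, 5.0])
print(constants)
```

Both probable errors shrink as 1/√n, which is why a series of only 200 cannot fix the numerical value of any constant very closely.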

Thus in any statistical enquiry the first part of the work consists
in the determining of the appropriate frequency constants
and their probable errors. This is done in section II of the present
paper, which also contains an elaborate technical discussion
of the effect of grouping.

The next part of our work consists in constructing a type
which is assumed to be true for the general population, within the
limits of the probable error of the type. This is the problem
discussed in section IV.


¹ I have given a short account of some of these constants in non-technical
language in Appendix I, pp. 90-94.


1922.] P. C. MAHALANOBIS: Analysis of Stature. 11


Once the typical curve is built up we can proceed to comparison
with other general populations as represented by their own
typical formulae. Goring observes, “no valid comparison between
two series of statistics is possible until the constants of each series
have been determined.”¹

But even then, no conclusion can be safely asserted from the
comparison, until a certain condition has been fulfilled. “Before
drawing conclusions from the comparison of statistics, we must be
certain that we are dealing with strictly random samples of the
same homogeneous material” (italics mine).

This introduces the second part of our work. For valid 
comparison we must investigate the homogeneity (or otherwise) of 
our material. I have discussed the statistical tests of homo- 
geneity in section III, and the application of these tests in 
section V. 

We then pass on to the question of comparison with other 
data. In section VI, I have considered the nature of the material 
for comparison and in the next section (section VII) I have in- 
vestigated the question of comparative homogeneity in great 
detail. 

In section VIII, I have added a preliminary note on the 
variation of stature with age. I shall discuss the question of age 
correlation and growth in a later paper. 


¹ Cf. Goring, p. 33. “In order that complex groups such as two series of
measurements, may be compared, these have to be reduced to a simple form, to 
the genius, as it were, of the series, i.e. certain values, called constants (the 
mean, mode, standard deviation, etc.), have to be extracted; and the groups 
compared through the medium of their constants. These values, however, are 
only themselves comparable in certain conditions. First, we must know that the 
statistics they represent are not chaotic in their distribution, that the sequence of
their frequencies have been determined by law. And, secondly, we must know 
the range of error to be discounted before any actual differences between the 
constants compared may be regarded as significant. Before we can assert that 
one series of measurements inherently differs from another, we must predict and 
allow for a certain amount of difference or arithmetical inexactness, which, 
according to the law of probability, is bound to appear in limited samples of the 
same homogeneous material. This predicted amount of insignificant difference 
is called, as we have already said, the probable error of the constants under 
consideration.” 

“Briefly resumed the matter stands thus: we must compare . . . not this
or that particular measurement, but the whole series of measurements obtained
from a random sample of (one population) with a similar whole series obtained 
from a random sample of (another) population. In order to make this compari- 
son two things will be necessary: we must extract from each series its statistical 
constants, the mean, the standard deviation, etc., of the series: and by the
theory of probability, we must determine for each constant obtained, its probable 
error. These constants, with their probable errors, will be the representatives 
of the series, which, through their medium, become comparable with each other. 
If the differences between the results compared are not greater than the probable 
errors of these results, such differences may be regarded as insignificant: if the 
difference is not greater than twice the probable error, it may be regarded as 
probably insignificant ; and if it is not greater than three times the probable error, 
it may be regarded as possibly insignificant. On the other hand, if any differ- 
ence found is greater than three times the probable error, it is reasonable to 
assume that the difference is due to some definite influence over and above those 
causes which are inherent in the sampling process.”
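
The rule of thumb quoted from Goring in the note above can be written out as a small Python sketch. The function names and the label "significant" for the last case are illustrative choices, and combining the two probable errors as the root of the sum of their squares assumes that the two samples are independent; Goring's text does not spell this step out.

```python
import math


def probable_error_of_difference(pe_a, pe_b):
    """Probable error of a difference between two independent constants (assumed)."""
    return math.hypot(pe_a, pe_b)


def goring_verdict(difference, probable_error):
    """Classify a difference by its ratio to the probable error, after Goring."""
    ratio = abs(difference) / probable_error
    if ratio <= 1:
        return "insignificant"
    if ratio <= 2:
        return "probably insignificant"
    if ratio <= 3:
        return "possibly insignificant"
    # More than three times the probable error: some definite influence is
    # reasonably assumed over and above the sampling process.
    return "significant"


# Hypothetical example: two means differing by 12 units,
# with probable errors of 3 and 4 units respectively.
pe_diff = probable_error_of_difference(3.0, 4.0)
print(pe_diff, goring_verdict(12.0, pe_diff))
```

In the hypothetical example the difference is 2.4 times its probable error, so on Goring's scale it would be counted as possibly insignificant.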
The raw material in the form of the actual measurements,
has been included in Appendix II. 

“Tables,” throughout the present paper, have reference to
the indispensable volume edited by Karl Pearson, “Tables for
Statisticians and Biometricians” (Cambridge University Press,
1914).


NOTE ON “BIAS” IN RECORDING MEASUREMENTS.


It is well known that different observers are affected with
different “personal bias” in taking measurements. In the present
case the crude data showed an overwhelming preponderance of
“even” readings as against “odd” measurements.

In the case of Stature, we find no less than 193 “even”
readings as against only 7 “odd.” We have no reason to believe
that Nature has any special preference for an “even” number of
millimetres; hence, apart from personal bias and fluctuations due to
random sampling, we should have had 100 “even” and 100 “odd”
readings. Instead of this, we actually get 193 and 7.

The presence of “bias” is obvious, but I have calculated
the “Contingency”¹ for the whole group of the above seven
measurements.


‘TABLE 1. 


Contingency for “ bias.’’ 


| | 
Theoretical Observed | : m—m',2 
ean value. _ value. a a! | (a : 
| | 

Stature ep ae 100 193 93 | 86°49 
Head Length... ee 100 174 74 54°76 
Head Breadth .. ie 100 ) 181 81 65°61 
Nasal Length .. bi 100 III II I‘21 
Nasal Breadth .. "~_ 100 93 7 0°49 
Zyg. Breadth .. ns 100 . 156 56 31°36 
Upper Face Length ‘i 100 105 ) 5 0'25 


hard x?=240'17 


—_ wt a a 


” 


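The probability that “random sampling” would lead to as
large or larger deviation between theory and observation is given by

$$P = e^{-\frac{1}{2}\chi^2}\left\{1 + \frac{\chi^2}{2} + \frac{\chi^4}{2\cdot 4}\right\},$$

so that

$$\log_{10} P = -\tfrac{1}{2}\chi^2\,\log_{10} e + \log_{10}\left\{1 + \frac{\chi^2}{2} + \frac{\chi^4}{2\cdot 4}\right\}.$$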


1 Karl Pearson: Phil. Mag., Vol. L, pp. 157-175, 1900.

With χ² = 240.17 we have

$$\log_{10} P = -\frac{240.17}{2}\,\log_{10} e + \log_{10}\left\{1 + \frac{240.17}{2} + \frac{(240.17)^2}{8}\right\}$$

$$\log_{10} P = -52.15225289 + 3.865282 = \overline{49}.713029$$

Thus P = 5.17 × 10⁻⁴⁹, or the chances are 2 × 10⁴⁸ to 1 against
there being no bias.
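
The arithmetic of Table 1 can be retraced with a short sketch. It assumes the three-term closed form of the χ² tail probability appropriate to seven groups; the variable and function names are illustrative only and the counts are those printed in Table 1.

```python
import math

# (theoretical, observed) number of "even" readings for each measurement
table_1 = {
    "Stature": (100, 193),
    "Head Length": (100, 174),
    "Head Breadth": (100, 181),
    "Nasal Length": (100, 111),
    "Nasal Breadth": (100, 93),
    "Zyg. Breadth": (100, 156),
    "Upper Face Length": (100, 105),
}


def chi_square(table):
    """Sum of (observed - theoretical)^2 / theoretical over all rows."""
    return sum((obs - exp) ** 2 / exp for exp, obs in table.values())


def log10_tail_probability(chi2):
    """log10 of P = exp(-chi2/2) * (1 + chi2/2 + chi2^2/8), for seven groups."""
    half = chi2 / 2.0
    series = 1.0 + half + half * half / 2.0
    return -half * math.log10(math.e) + math.log10(series)


chi2 = chi_square(table_1)
log_p = log10_tail_probability(chi2)
print(round(chi2, 2), round(log_p, 3))
```

Working with the logarithm avoids underflow and mirrors the hand computation; the result is a probability of the order of 10⁻⁴⁹.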
In the case of Stature the unit of grouping is greater than
10 mm. and hence this preponderance of even values of millimetres
is not a matter of great consequence.


SECTION I. EFFECT OF GROUPING ON THE
FREQUENCY CONSTANTS.


FREQUENCY CONSTANTS AND PROBABLE ERRORS. 


The object of the enquiry contained in this section may be
best explained in Karl Pearson’s words.¹

“It is well known that if the distribution of errors follows
the normal law, the “best” method of finding the mean is to
add up all the errors and divide by their number, the “best”
method of finding the square of the standard deviation is to form
the squares of the deviations from the mean and divide by their
number. . . . These “best” methods become far too laborious
in practice when the deviations run into hundreds or even thousands.
The deviations are then grouped together, each group containing
all deviations falling within a certain small range of quantity,
and the means, standard deviations, and correlations are
deduced from these grouped observations. If the means, standard
deviations, and correlations be calculated from the grouped
frequencies as if these frequencies were actually the frequency of
deviations coinciding with the midpoints of the small ranges
which serve for the basis of the grouping, we do not obtain the
same values as in the cases of the ungrouped observations. It
becomes of some importance what corrective terms ought to be
applied to make the grouped and ungrouped results accord. This
point has been considered by Mr. W. F. Sheppard (who has proposed
certain corrections). Thus corrected the values of the constants
of the distribution as found from the ungrouped and grouped
deviations will nearly, but not of course absolutely, coincide.”


In this section I have calculated both ungrouped and grouped 
constants with widely differing units of grouping. The constants 
as corrected by Sheppard’s formulae have also been calculated in 
each case. By a comparison of the different constants we find 
that within very wide limits the effect of grouping is negligible. 

The Stature list was classified into groups of 50 mm. The
base number is taken to be 1655 mm. and the moment coefficients
were calculated as shown below.²

We get the following table for “raw” moments about 1655:


1 Karl Pearson: “On the Mathematical Theory of Errors of Judgment and
on the Personal Equation,” Phil. Trans. Roy. Soc., Vol. 198A, 1902, pp. 249,
250.

2 For details, see K. Pearson: “On the Systematic Fitting of Curves, etc.”
Part I, Biometrika, Vol. I, 1902, pp. 265-303 and Vol. II, 1902, pp. 1-24. Also
W. Palin Elderton: “Frequency Curves and Correlation,” pp. 13-19 (C. and
E. Layton, 1917) and G. Udny Yule: “Theory of Statistics” (Charles Griffin
& Co.)
Stature (mm.)  Mid-ordinate    x     y     xy    x²y    x³y    x⁴y     x⁵y     x⁶y

1430-1480        1455         -4     3    -12     48   -192    768   -3072   12288
1480-1530        1505         -3     5    -15     45   -135    405   -1215    3645
1530-1580        1555         -2    14    -28     56   -112    224    -448     896
1580-1630        1605         -1    45    -45     45    -45     45     -45      45
1630-1680        1655          0    60      0      0      0      0       0       0
1680-1730        1705         +1    48     48     48     48     48      48      48
1730-1780        1755         +2    20     40     80    160    320     640    1280
1780-1830        1805         +3     3      9     27     81    243     729    2187
1830-1880        1855         +4     2      8     32    128    512    2048    8192

Negative sum                             -100          -484          -4780
Positive sum                             +105          +417          +3465

TOTAL                              200     +5    381    -67   2565   -1315   28581


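Dividing by the total, N = 200, we get for the “raw” moments,
S denoting a summation for all groups:

$$\nu_1' = \frac{S(xy)}{N} = +0.025, \qquad \nu_2' = \frac{S(x^2y)}{N} = +1.905, \qquad \nu_3' = \frac{S(x^3y)}{N} = -0.335,$$

$$\nu_4' = \frac{S(x^4y)}{N} = +12.825, \qquad \nu_5' = \frac{S(x^5y)}{N} = -6.575, \qquad \nu_6' = \frac{S(x^6y)}{N} = +142.905.$$

The true Mean is given by
1655 + (.025 × 50) = 1656.25 mm.
Transferring¹ to the true Mean with the help of:

$$\mu_2 = \nu_2' - \nu_1'^2$$

$$\mu_3 = \nu_3' - 3\nu_1'\nu_2' + 2\nu_1'^3$$

$$\mu_4 = \nu_4' - 4\nu_1'\nu_3' + 6\nu_1'^2\nu_2' - 3\nu_1'^4$$

$$\mu_5 = \nu_5' - 5\nu_1'\nu_4' + 10\nu_1'^2\nu_3' - 10\nu_1'^3\nu_2' + 4\nu_1'^5$$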


1 Karl Pearson: ‘‘ Contributions to the Mathematical Theory of Evolution— 
On the Dissection of Asymmetrical Frequency-curves,’’ Phil. Trans. Roy. Soc., 
Vol. 185 A (1894), p. 71. 
we get moments about Mean (without correction):

μ2 =  1.904375
μ3 = -0.47784377
μ4 = 12.86564258
μ5 = -8.18049398.

The moments were checked by calculating the “raw”
moments about 143.0 cm. (end of range) as base. The
“raw” moments were

ν1' = 4.525, ν2' = 22.38, ν3' = 118.02625, ν4' = 657.4275,
ν5' = 3846.6203125,
but after transferring to the Mean, the same values as before were
obtained.
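
The check described above, that two different base numbers lead to the same moments about the Mean, can be sketched as follows. The frequencies are those of the nine 50 mm. groups of stature; the function names are illustrative, and the transfer formulae are the standard ones for passing from raw to central moments.

```python
# Frequencies of the nine 50 mm. groups, 1430-1480 up to 1830-1880
freq = [3, 5, 14, 45, 60, 48, 20, 3, 2]


def raw_moments(freq, x, order=5):
    """Raw moments nu_1' .. nu_order' about the origin used for x."""
    n = sum(freq)
    return [sum(f * xi ** k for f, xi in zip(freq, x)) / n
            for k in range(1, order + 1)]


def central_from_raw(v):
    """Moments mu_2 .. mu_5 about the mean from raw moments nu_1' .. nu_5'."""
    v1, v2, v3, v4, v5 = v
    mu2 = v2 - v1 ** 2
    mu3 = v3 - 3 * v1 * v2 + 2 * v1 ** 3
    mu4 = v4 - 4 * v1 * v3 + 6 * v1 ** 2 * v2 - 3 * v1 ** 4
    mu5 = (v5 - 5 * v1 * v4 + 10 * v1 ** 2 * v3
           - 10 * v1 ** 3 * v2 + 4 * v1 ** 5)
    return mu2, mu3, mu4, mu5


x_1655 = list(range(-4, 5))            # mid-points in working units about 1655
x_1430 = [i + 0.5 for i in range(9)]   # mid-points measured from 1430

raw_1655 = raw_moments(freq, x_1655)
raw_1430 = raw_moments(freq, x_1430)
central_1655 = central_from_raw(raw_1655)
central_1430 = central_from_raw(raw_1430)
mean_mm = 1655 + 50 * raw_1655[0]
print(mean_mm, central_1655[0], central_1430[0])
```

Both origins give the same central moments up to rounding, which is the arithmetical check the text relies on.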

The Standard Deviation¹ (S.D.) is given by σ = √μ2.

Thus σ = ± 1.38 in working units
       = ± 69.00 mm.

The Coefficient of Variation² (V) is defined by 100σ/M, and we get

V = 4.1660.

We must now proceed to find the other frequency constants³:

β1 = μ3²/μ2³        β1 = .033204
β2 = μ4/μ2²         β2 = 3.547534
Skewness = Sk. = .069858

where

$$\text{skewness} = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}.$$

Distance between Mode and Mean = d = σ × skewness.
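
As a worked illustration of these definitions, the sketch below evaluates β1, β2 and the skewness from the uncorrected moments about the Mean for the 50 mm. grouping. The function name is hypothetical, and the last decimals obtained by machine need not agree exactly with the figures printed above, which were computed by hand.

```python
import math


def pearson_constants(mu2, mu3, mu4):
    """Return (beta1, beta2, skewness) from moments about the mean."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    # Magnitude of the skewness as defined in the text.
    skewness = (math.sqrt(beta1) * (beta2 + 3)
                / (2 * (5 * beta2 - 6 * beta1 - 9)))
    return beta1, beta2, skewness


# Uncorrected moments about the mean, in working units of 50 mm.
beta1, beta2, skewness = pearson_constants(1.904375, -0.47784375, 12.86564258)
sigma_mm = 50 * math.sqrt(1.904375)
d_mm = sigma_mm * skewness   # distance between Mode and Mean, in mm.
print(round(beta1, 4), round(beta2, 4), round(skewness, 4), round(d_mm, 2))
```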


It is now necessary to find the Probable Errors.⁴


1 Also see Appendix I.
2 Karl Pearson: “Regression, Heredity and Pan-mixia,” Phil. Trans. Roy.
Soc., Vol. 187A (1896), p. 203. See footnote on p. 34.
3 (i) Karl Pearson: “Skew Variation in Homogeneous Material,” Phil.
Trans. Roy. Soc., Vol. 186A (1895), pp. 343-414; Supplement, Vol.
197A (1901), pp. 443-459.
(ii) Karl Pearson: “On the Mathematical Theory of Errors of Judgment,”
Phil. Trans. Roy. Soc., Vol. 198A (1902), pp. 274-279 and
p. 277.
(iii) “Skew Frequency Curves,” Biometrika, Vol. 4 (1905), pp. 169-212;
Biometrika, Vol. 5 (1906), pp. 168-171 and pp. 172-175.
(iv) W. Palin Elderton: “Frequency Curves and Correlation” (Charles
and Edwin Layton, London) with Addendum and Errata, 1917.

4 The fundamental memoirs are: (a) Karl Pearson and L. N. G. Filon: “On
the Probable Errors of Frequency Constants and on the Influence of
Random Selection on Variation and Correlation,” Phil. Trans. Roy.
Soc., Vol. 191A (1898), pp. 229-311.

(b) W. F. Sheppard: “On the application of the Theory of Error to cases
of Normal Distribution and Normal Correlation,” Phil. Trans.
Roy. Soc., Vol. 192A (1899), pp. 101-167.
The Probable Error of Mean¹

$$= \frac{.6744898\,\sigma}{\sqrt{n}}.$$

Probable Error of Standard Deviation

$$= \frac{.6744898\,\sigma}{\sqrt{2n}}.$$

Probable Error of Coefficient of Variation

$$= \frac{.6744898\,V}{\sqrt{2n}}\left\{1 + 2\left(\frac{V}{100}\right)^2\right\}^{\frac{1}{2}}.$$

We find, Probable Error of Mean = 0.32906 cm.
         Probable Error of S.D. = 0.23267 cm.
         Probable Error of V    = 0.14166.
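
A minimal sketch of the three probable-error formulae, applied to the 50 mm. grouping with σ = 69.00 mm., V = 4.16604 and n = 200. The function names are illustrative; results are in millimetres for the Mean and S.D., and hand-computed figures may differ from these in the last decimals.

```python
import math

Q = 0.6744898  # ratio of the probable error to the standard error


def pe_mean(sd, n):
    """Probable error of the mean."""
    return Q * sd / math.sqrt(n)


def pe_sd(sd, n):
    """Probable error of the standard deviation (normal curve)."""
    return Q * sd / math.sqrt(2 * n)


def pe_cv(v, n):
    """Probable error of the coefficient of variation V (in per cent)."""
    return Q * v / math.sqrt(2 * n) * math.sqrt(1 + 2 * (v / 100) ** 2)


print(round(pe_mean(69.0, 200), 4),
      round(pe_sd(69.0, 200), 4),
      round(pe_cv(4.16604, 200), 4))
```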
The Probable Error of S.D. requires correction for skewness.²

$$\text{P.E. of S.D.} = \frac{.6744898\,\sigma}{\sqrt{2n}}\left\{1 + \tfrac{1}{2}(\beta_2 - 3)\right\}^{\frac{1}{2}},$$

which reduces to the usual expression involving σ/√(2n) for the
normal curve, since β2 - 3 = 0 in that case. Making this
correction we get P.E. of S.D. = 0.2643 cm. This correction has
been made in all subsequent work, but the difference made is not
considerable in any case.

The probable errors of β1 and β2, skewness and d were found
from Tables XXXVII, XXXVIII, XLI and XLII, pp. 68-77 of Tables
for Biometricians.³


Probable Error of β1. Table XXXVII, p. 68.
β1 = .0332
β2 = 3.5:  √N Σβ1 = 0 + (332/500)(1.37) = 0.9069

(c) “On the Probable Errors of Frequency Constants,” Biometrika, Vol. 2
(1903), p. 272.

(d) Karl Pearson: “On the Mathematical Theory of Errors of Judgment,”
Phil. Trans. Roy. Soc., Vol. 198A (1902), pp. 274-279.

(e) “Probable Errors of Frequency Constants,” Part II, Biometrika, Vol. 9
(1913).

1 Tables were published by W. Gibson and Raymond Pearl (Biometrika,
pp. 385-393) to facilitate the calculation of probable errors. These have
now been reprinted as Tables V and VI in “Tables for Statisticians and
Biometricians” (Cambridge University Press, 1914).

2 Karl Pearson, Editorial Note on a paper by Raymond Pearl: “On Certain
Points concerning the Probable Error of the Standard Deviation,” Biometrika,
Vol. 6 (1909), p. 117.

3 These tables were originally published by A. Rhind in Biometrika, Vol. 7
(1910), pp. 126-147 and pp. 386-397. Rhind gives an excellent summary of the
whole subject.
β2 = 3.5475:  √N Σβ1 = 0.9069 + (475/1000)(.0861).

Multiplying by χ1 = .67449/√n = .04769
we get P.E. of β1 = .045207.

Then from Table XXXVIII, p. 71, with β1 = .0332:

β2 = 3.5:  √N Σβ2 = 11.4458 (interpolated for β1, tabular difference 0.9)
β2 = 3.6:  √N Σβ2 = 13.3783 (interpolated for β1, tabular difference 1.07)
For β2 = 3.5475:  √N Σβ2 = 11.4458 + (475/1000)(1.9325),
hence P.E. of β2 = .589625.

From Table XLI, p. 76, interpolating in the same way between
β2 = 3.5 and β2 = 3.6 (tabulated values about 1.30 and 1.31), we get

P.E. of Skewness = .0626.

We thus find
Mean, M = 1656.25 ± 3.2906 mm.
S.D., σ = 69.00 ± 2.6431 mm.
Coeff. of Variation, V = 4.16604 ± .1407
The other constants are:
β1 = .033204 ± .045207
β2 = 3.547534 ± .589625
Skewness = sk = .069858 ± .0626

We thus find that the skewness is not significant and we
are justified in assuming a normal distribution as an
approximation.

On this assumption we can find the P.E. of the moments
quite easily.
The S.D. of any moment μq in a sample of size n is given by¹

$$n\,\sigma_{\mu_q}^2 = \mu_{2q} - \mu_q^2 - 2q\,\mu_{q-1}\mu_{q+1} + q^2\,\mu_2\,\mu_{q-1}^2 .$$

For the normal curve this gives

$$\text{P.E. of } \mu_2 = .67449\sqrt{\frac{2}{n}}\;\sigma^2, \qquad \text{P.E. of } \mu_3 = .67449\sqrt{\frac{6}{n}}\;\sigma^3, \qquad \text{P.E. of } \mu_4 = .67449\sqrt{\frac{96}{n}}\;\sigma^4 .$$

For q = 5, we must find μ10.

But Sheppard² has shown that for the normal curve (in our
present notation)

$$\mu_{2s+1} = 0, \qquad \mu_{2s} = (2s-1)(2s-3)\cdots 3\cdot 1\;\sigma^{2s}.$$

Hence we get μ10 = 945 σ¹⁰.

Substituting in the above formula, we get

$$\text{P.E. of } \mu_5 = .67449\sqrt{\frac{720}{n}}\;\sigma^5 .$$

We thus get:
μ2 =  1.904375   ± 0.128431
μ3 = -0.47784377 ± 0.306985
μ4 = 12.86564258 ± 1.6946
μ5 = -8.18049398 ± 6.404550
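
The numerical factors 2, 6, 96 and 720 all follow from the general formula for the S.D. of a moment once the normal-curve moments are substituted. The sketch below does this substitution mechanically; the function names are illustrative, σ = 1.38 working units and n = 200 are the values of the 50 mm. grouping, and machine results may differ from the hand-computed figures in the last decimals.

```python
import math

Q = 0.6744898  # ratio of the probable error to the standard error


def normal_moment(k, sigma):
    """k-th moment about the mean of a normal curve: (k-1)(k-3)...1 * sigma^k."""
    if k % 2:
        return 0.0
    product = 1.0
    for j in range(1, k, 2):
        product *= j
    return product * sigma ** k


def pe_moment(q, sigma, n):
    """Probable error of mu_q for a normal curve, from the general formula."""
    def m(k):
        return normal_moment(k, sigma)

    variance = (m(2 * q) - m(q) ** 2 - 2 * q * m(q - 1) * m(q + 1)
                + q * q * m(2) * m(q - 1) ** 2) / n
    return Q * math.sqrt(variance)


sigma, n = 1.38, 200
for q in (2, 3, 4, 5):
    print(q, round(pe_moment(q, sigma, n), 4))
```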


SHEPPARD’S CORRECTION. 


I shall now consider the question of corrections for grouping.
The theoretical work in this subject now consists of a good deal of
literature. I shall discuss this question from a purely practical
point of view. The fundamental memoir is W. F. Sheppard³:
“On the Calculation of the most Probable Values of Frequency
Constants, for Data arranged according to Equidistant Divisions of
Scale,” Proc. Lond. Math. Soc., Vol. 29, pp. 353-380.


1 “On Probable Errors of Frequency Constants,” Biometrika, Vol. 2 (1903),
p. 270.
2 W. F. Sheppard: Phil. Trans. Roy. Soc., 192A.

3 (a) A summary of Sheppard’s memoir (with some new results) is given in
an Editorial Note: “On an Elementary Proof of Sheppard’s Formulae for
correcting Raw Moments and on Other Allied Points” in Biom. Vol. 3, pp. 308-310.

(b) In Pearson’s paper: “On Systematic Fitting of Curves, etc.” Biom. Vols.
1 and 2, this question has been discussed from a different standpoint.

(c) Sheppard himself has given a simplified method of obtaining certain
corrections in a later paper: “The Calculation of Moments of a Frequency-
Distribution,” Biom. Vol. 5 (1907), pp. 450-459.

(d) Eleanor Pairman and Karl Pearson have published a memoir: “On
Corrections for the Moment-coefficients of Limited Range Frequency Distributions
etc.” in Biom. Vol. 12 (1919), pp. 231-258, which I shall have occasion to
discuss later on in greater detail.

In our notation the above correction (which is known as Sheppard’s
correction) is given by the following set of equations:

$$\mu_1' = \nu_1'$$

$$\mu_2' = \nu_2' - \tfrac{1}{12}h^2$$

$$\mu_3' = \nu_3' - \tfrac{1}{4}h^2\nu_1'$$

$$\mu_4' = \nu_4' - \tfrac{1}{2}h^2\nu_2' + \tfrac{7}{240}h^4$$

$$\mu_5' = \nu_5' - \tfrac{5}{6}h^2\nu_3' + \tfrac{7}{48}h^4\nu_1'$$

$$\mu_6' = \nu_6' - \tfrac{5}{4}h^2\nu_4' + \tfrac{7}{16}h^4\nu_2' - \tfrac{31}{1344}h^6$$

h is the length of the base unit; it is usually = 1 for working units.

50 mm. unit of grouping.

Making these corrections we find adjusted moments about 1655
to be

μ1' =  0.025,      μ2' =  1.821667,  μ3' = -0.341250,
μ4' = 11.901667,   μ5' = -6.292187,  μ6' = 147.339783.

Now transferring to Mean we get

μ2 =  1.821042      μ3 = -0.4778
μ4 = 11.941622      μ5 = -7.7823

Hence we finally get “corrected” constants:
Mean = 1656.25 ± 3.217 mm.
S.D. = 67.473 ± 2.6162 mm.
Coeff. of Variation, V = 4.0738 ± .1376

β1 = .037810 ± .054133
β2 = 3.601000 ± .712069
sk = .073110 ± .062232

d = 4.932950 ± 4.223020 mm.


Note.—Starting with 1430 as our base unit, we reach the same results, thus 
the arithmetic is absolutely checked in this case. 
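
A minimal sketch of the correction as applied to the raw moments about 1655 for the 50 mm. grouping. It assumes the usual form of Sheppard’s corrections for raw moments up to the fifth order, with h = 1 in working units; the function name is illustrative.

```python
import math


def sheppard_raw(v, h=1.0):
    """Sheppard-corrected raw moments (orders 1 to 5) from raw moments v."""
    v1, v2, v3, v4, v5 = v
    return (
        v1,
        v2 - h ** 2 / 12,
        v3 - h ** 2 * v1 / 4,
        v4 - h ** 2 * v2 / 2 + 7 * h ** 4 / 240,
        v5 - 5 * h ** 2 * v3 / 6 + 7 * h ** 4 * v1 / 48,
    )


# Raw moments about 1655 in working units of 50 mm.
raw = (0.025, 1.905, -0.335, 12.825, -6.575)
corrected = sheppard_raw(raw)

# Corrected second moment about the mean, and the S.D. in millimetres
mu2 = corrected[1] - corrected[0] ** 2
sd_mm = 50 * math.sqrt(mu2)
print([round(c, 6) for c in corrected], round(sd_mm, 3))
```

The corrected S.D. comes out at about 67.47 mm., against 69.00 mm. without the correction.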


The Frequency Constants were next calculated (both with
and without Sheppard’s correction) for widely different units of
grouping. We have 1 mm., 20 mm., 30 mm., 50 mm. and finally 100
mm. as our unit of grouping. It will be observed that the unit of
grouping is thus successively made the same as, then 20 times,
30 times, 50 times and finally 100 times the unit of measurement.

With “ungrouped” (i.e., 1 mm.) measurements, the arithmetical
labour is tremendous. In this case the maximum value of x
is about 210, which involves calculating (210)⁴ for the fourth moment.
Hence it was not possible to go beyond the fourth moment. As
it is, the actual sum of fourth-products, i.e., S(x⁴y), runs into 11
figures. I quote actual results:

S(xy)  =            158
S(x²y) =         908272
S(x³y) =       -6768878
S(x⁴y) =    14404286067


———— ——— 


: 
which gives us (dividing by 200):

ν1 =         0.79
ν2 =      4541.36
ν3 =    -34844.39
ν4 =  72021430.33

For purposes of comparison it is necessary to reduce all
moments to the same unit. 50 mm. was chosen as the standard unit.¹
Let μn be any moment in units of grouping h, let Mn be the
corresponding moment in standard units h0, and let p = h/h0.

Then Mn = pⁿ μn is the formula of reduction to standard unit.

For h0 = 50 mm., p = 1/50, 2/5, 3/5 and 2 successively for units
of 1 mm., 20 mm., 30 mm. and 100 mm. respectively.
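
The reduction to the standard unit amounts to multiplying the n-th moment by the n-th power of the ratio of the two units. A small sketch, with an illustrative function name, applied to the second moment of the ungrouped (1 mm.) series:

```python
import math


def reduce_moment(moment, h, h0, order):
    """Express a moment of the given order, computed in units h, in units h0."""
    p = h / h0
    return p ** order * moment


# Second moment about the mean for the 1 mm. series, in mm^2
mu2_mm = 4541.36 - 0.79 ** 2
M2 = reduce_moment(mu2_mm, h=1, h0=50, order=2)
sd_mm = 50 * math.sqrt(M2)   # back to millimetres, as a check
print(round(M2, 6), round(sd_mm, 3))
```

The S.D. recovered in this way is about 67.385 mm., the value quoted for the ungrouped series.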

The annexed table gives the Frequency Constants for the 
different units of grouping. I have added the probable errors in 
each case. 

For the purpose of studying the effect of grouping it is natural 
to take the ‘‘ ungrouped ’’ constants as our standard. We have 
accordingly assumed that the 1 mm. constants are the “ true” 
constants. 

Different Values of Mean Stature.


Unit of Grouping.

  1 mm.     1656.79 ± 3.23 mm.
 20 mm.     1656.[illegible] ± 3.23 mm.
 30 mm.     1656.35 ± 3.25 mm.
 50 mm.     1656.25 ± 3.22 mm.
100 mm.     1659.50 ± 3.09 mm.


When the unit of grouping is so large as 100 mm. (and the
total record is divided only into 5 groups), there is considerable
difference in the Mean. But this difference of 2.71 mm. is less than
the probable error of over 3 mm. Thus even with 100 mm. grouping,
the Mean is stable within the limits of its own probable error.

The agreement is almost perfect when we omit the 100 mm.
group. The maximum “error” due to grouping amounts to only
.54 mm., which is considerably less than the unit of measurement
itself and is about one-sixth of the probable error.

Let us consider a very large sample of 7,500 individuals. It
is not likely that the Standard Deviation will exceed 70 mm. The
P.E. of Mean will be about .55 mm. The maximum observed
difference in the present case, due to grouping, is thus of the same
order as the random P.E. of the Mean in a sample of 7,500. We
conclude therefore that for samples of 200, the effect of grouping on the
Mean up to 50 mm. is quite negligible.
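
The comparison with a sample of 7,500 can be retraced by inverting the formula for the probable error of the Mean. The sketch below assumes σ = 70 mm. as in the text; the function names are illustrative.

```python
import math

Q = 0.6744898  # ratio of the probable error to the standard error


def pe_mean(sd, n):
    """Probable error of the mean of n observations with standard deviation sd."""
    return Q * sd / math.sqrt(n)


def sample_size_for_pe(sd, target_pe):
    """Sample size at which the probable error of the mean equals target_pe."""
    return (Q * sd / target_pe) ** 2


pe_7500 = pe_mean(70.0, 7500)               # P.E. of the mean for 7,500 individuals
n_equiv = sample_size_for_pe(70.0, 0.54)    # sample whose P.E. is .54 mm.
print(round(pe_7500, 3), round(n_equiv))
```

A grouping error of .54 mm. is thus matched by the sampling error of a series of roughly 7,600 individuals, which is the order of magnitude quoted.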


! For reasons explained on pp. 39-40. 
Standard Deviation.


Let us first consider the results without Sheppard’s correction.

  1 mm.     67.385 ± 2.557 mm.
 20 mm.     67.894 ± 2.547 mm.
 30 mm.     68.365 ± 2.444 mm.
 50 mm.     69.00  ± 2.679 mm.
100 mm.     70.922 ± 2.662 mm.

With 100 mm. the difference is quite large. It is 3.537 mm.,
which is considerably greater than the prob. error. Omitting 100
mm. we find the maximum difference to be 1.615 mm., which is
considerable, but is still less than the P.E. Such a P.E. will be
obtained with samples of 400. Thus the agreement without Sheppard’s
correction is not very good.

With Sheppard’s correction:

  1 mm.     67.385 ± 2.557 mm.
 20 mm.     67.648 ± 2.539 mm.
 30 mm.     67.812 ± 2.420 mm.
 50 mm.     67.473 ± 2.619 mm.
100 mm.     64.77  ± 2.432 mm.

100 mm. is again discrepant. The difference is 2.615 mm.,
which is of just the same order as the P.E. Evidently 100 mm.
grouping is too broad and the error due to grouping is no longer
negligible. This is also obvious from the fact that Sheppard’s
correction makes the S.D. actually less than its true value, while
the uncorrected value is considerably greater.

Omitting 100 mm. the agreement is excellent. The maximum
difference (which is now in the 30 mm. group) is only .427
mm., a value about a sixth of the probable error. It will require
a sample of 6000 to produce a random error of the same amount.

Thus with Sheppard’s correction, the effect of grouping is
quite negligible up to 50 mm. These corrections are so easily
applied that there can be no excuse for omitting them. We have
thus empirically verified the great importance of Sheppard’s
correction in giving better values of the Frequency Constants.
Henceforth it will not be necessary to compare the values obtained
without Sheppard’s correction.

Coefficient of Variation: V = 100σ/M.

  1 mm.     4.0672 ± .1374
 20 mm.     4.0829 ± .1379
 30 mm.     4.0941 ± .1383
 50 mm.     4.0738 ± .1376
100 mm.     3.9029 ± .1318

100 mm. is obviously incorrect; we may omit this group from
further consideration. The difference .1643 is greater than the
P.E. Omitting 100 mm., the maximum difference is .0269, which
will be the P.E. in a random sample of 5,000 (with coeff. of
variation equal to 4). Thus the effect of grouping is of the same order
as the effect of sampling in a group of 5,000. Hence we conclude
that different units of grouping do not introduce any appreciable
errors in the Coefficient of Variation.

From the Anthropological standpoint, the Mean, the S.D.,
and the Coeff. of Variation are the most important constants.
For stature, with samples of 200 with Sheppard’s correction the
effect of even such a large unit of grouping as 50 times the unit of
measurement is in all these cases absolutely inappreciable.

We shall however consider the other statistical constants 
before concluding this portion of our work. 


Values of μ2.

With Sheppard’s correction:

  1 mm.     1.816294 ± .122505
 20 mm.     1.830498 ± .123477
 30 mm.     1.839472 ± .124083
 50 mm.     1.821042 ± .122840
100 mm.     1.6787   ± .113239

100 mm. makes a difference of .1376, which is just about the
same as the P.E. Otherwise the maximum difference is .0232,
which is only a sixth of the P.E. A random error of the same
amount will be produced in samples of 2800.

Let us now compare the values obtained without Sheppard’s
correction:

  1 mm.     1.816294 ± .122507
 20 mm.     1.843831 ± .124309
 30 mm.     1.869472 ± .126106
 50 mm.     1.904375 ± .128460
100 mm.     2.0120   ± .1357

100 mm. introduces an error of .1958, which is considerably
greater than the P.E.

The effect of grouping has now become quite obvious: 20 mm.,
30 mm. and 50 mm. now introduce steadily increasing error.
With 50 mm. the error has now amounted to .0881, which is
two-thirds of the P.E.

We thus see that Sheppard’s correction is absolutely
indispensable here. With Sheppard’s correction the effect is quite
negligible up to 50 mm.

Values of μ3.

With Sheppard’s correction:

  1 mm.     -.643606 ± .2859
 20 mm.     -.308774 ± .2927
 30 mm.     -.468667 ± .2986
 50 mm.     -.477844 ± .3070
100 mm.     -.403570 ± .3335

24 Records of the Indian Museum. [ Vox,.. Xone 


100 mm. is not at all worse than others. The maximum error
(which now occurs in the 20 mm. group), .3349, just exceeds the
P.E.


Without Sheppard’s correction:

 20 mm.     -.308774 ± .2908
 30 mm.     -.4655 ± .2914
 50 mm.     [illegible] ± .287[illegible]
100 mm.     [illegible]


Evidently Sheppard’s correction does not produce substantial
improvements. In this case the gross P.E. of μ3 is of the
same order as μ3 itself and hence there is wide fluctuation in the
result.

In view of the large P.E. we cannot say that grouping makes
any significant difference. The asymmetry is very slight and very
nearly zero, thus the fluctuations though large are not statistically
significant. These wide fluctuations indicate the critical approach
to the Gaussian curve.


Values of μ4.

With Sheppard’s correction:

  1 mm.     11.561021 ± 1.5415
 20 mm.     11.565426 ± 1.5659
 30 mm.     10.971178 ± 1.5808
 50 mm.     11.941622 ± 1.5497
100 mm.     10.3596   ± 1.3158

100 mm. makes a difference of 1.2014, which nearly equals the
P.E. Otherwise the agreement is good. The maximum error is
.59 (in the 30 mm. group), which is much less than half the P.E.
Random error of the same amount will require samples of 1300
individuals.

Without Sheppard’s correction the agreement is much worse.
We have

  1 mm.     11.561021 ± 1.5415
 20 mm.     11.7020   ± 1.5887
 30 mm.     11.3037   ± 1.6331
 50 mm.     12.865642 ± 1.6946
100 mm.     13.981248 ± 1.8913

100 mm. has become too “rough” and 50 mm. itself introduces
an error of about the same order as the P.E. Thus Sheppard’s
corrections make substantial improvement in the results. The
percentage probable error of μ4 for normal curves is given by

$$\frac{100\,\chi_1\sqrt{96}}{3} = 15.7\% \text{ in our case.}$$

In view of this large percentage variation, the observed
agreement with different groupings is quite satisfactory.
Values of μ5.

With Sheppard’s correction:¹

 30 mm.     -11.925572 ± 5.8711
 50 mm.      -7.7823   ± 5.7274
100 mm.     -11.094934 ± 4.6678

Without Sheppard’s correction:

 50 mm.      -8.180494 ± 6.4046
100 mm.     [illegible] ± 7.34[illegible]

The gross prob. error is again of the same order as μ5 itself.
Hence there is very wide fluctuation in its value and Sheppard’s
correction is not important. It should be noted however that
even now the maximum difference (inter se) is less than the P.E.


Values of μ6.

 30 mm.     121.8305 ± 29.9343
 50 mm.     [illegible]

The percentage P.E. for the normal curve is [expression illegible] = 32.63%.
With such large percentage variation it is quite idle to calculate the
higher moments directly.

Pearson says in this connection² “Constants based on high
moments will be practically idle. They may enable us to describe
closely an individual random sample, but no safe argument can be
drawn from this individual sample as to the general population at
large, at any rate so far as the argument is based on the constants
depending upon these high moments.”


Values of β1.

  1 mm.     .068756 ± .079781
 20 mm.     .015538 ± .019324
 30 mm.     [illegible]
 50 mm.     .037810 ± .054133
100 mm.     .049490 ± .0631

Remembering that β1 = μ3²/μ2³, we are quite prepared for such
wide fluctuations. It will be seen that β1 differs from zero by just
about the same amount as its own P.E. (calculated separately for
each), which of course implies that there is a tendency towards β1
differing slightly from zero, but that with a small sample of 200
this tendency has not become quite significant. The unit of
grouping does not make any difference so far as this tendency is
concerned. With 50 mm. without correction, β1 is = .033204 ±
.045207. Thus Sheppard’s correction is not important.


1 On account of the great arithmetical labour, it has not been found possible
to calculate μ5 and μ6 with lower units of grouping.

2 Drapers’ Company Research Memoirs: “On the General Theory of Skew
Correlation and Non-Linear Regression,” p. 9.


Values of β2.

  1 mm.     3.5045   ± .6017
 20 mm.     3.451621 ± .497297
 30 mm.     3.2424   ± .354489
 50 mm.     3.601000 ± .712069
100 mm.     3.4536   ± [illegible]
 50 mm.     3.547534 ± .589625 (without correction).

Though β2 does not seem to differ significantly from 3, there is
a slight tendency towards lepto-kurtosis.¹

The P.E. of β2 for a Gaussian distribution is χ1 √24 and is
about ±.23 in our case. The magnitude of P.E. again shows the
want of significant divergence from meso-kurtosis.
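
The figure of about ±.23 follows directly from the expression just given, with χ1 = .67449/√n and n = 200. A one-function sketch (the function name is illustrative):

```python
import math

Q = 0.6744898  # ratio of the probable error to the standard error


def pe_beta2_normal(n):
    """Probable error of beta_2 in samples of n from a Gaussian distribution."""
    chi_1 = Q / math.sqrt(n)
    return chi_1 * math.sqrt(24)


pe_b2 = pe_beta2_normal(200)
print(round(pe_b2, 4))
```

Set against this, the observed excess of β2 over 3 for the corrected 50 mm. series (about .60) is between two and three times the Gaussian probable error.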

The effect of grouping is evidently quite negligible. The 
above investigation has been most elaborate in character: and is 
sufficient to justify the application of ‘‘grouped”’ statistical 
methods to our present material. 

The foregoing analysis may be summarized thus:

(1) With samples of 200, even such broad grouping as 100 mm.
does not introduce errors greater than the random error of sampling.

(2) Up to 50 mm. the effect of grouping is absolutely negligible.
In the case of the Mean, the S.D. and the Coeff. of Variation,
“grouping error” is of the same order as “random error” in
samples of several thousands of individuals.

(3) Sheppard’s correction leads to a very substantial improvement
in the S.D. and the even moments. The odd moments
(being near a critical value) are not affected very much. Speaking
generally, Sheppard’s correction should never be omitted.

(4) The percentage variation in the higher moments is too
large to make it worth while calculating them directly.

I speak with hesitation about another inference which may
perhaps be drawn from the above investigation. Small errors of
estimating stature, even up to perhaps a few mm., are not likely to
affect the Mean value very considerably (provided these errors are
random errors and not systematic).


“FULL CORRECTIONS” OF PAIRMAN AND PEARSON.


We shall now consider certain “full corrections” recently
discussed by Pairman and Pearson.² The object of the above
paper was to investigate the full corrections for curtailed blocks of
frequency.

The general shape of our curve showed that there was no
significant curtailing; still I thought it advisable to investigate
this point more carefully.


1 K. Pearson: “Skew Variation, a Rejoinder,” Biom. Vol. 4 (1906), p. 175.
Also Appendix II.

2 Eleanor Pairman and K. Pearson: “On Corrections for the Moment-
Coefficients of Limited Range Frequency Distributions when there are Finite or
Infinite Ordinates and any Slopes at the Terminals of the Range.” Biom. Vol.
12 (1919), pp. 231-258.

We choose 50 mm. unit of grouping as our standard and find
“raw” moments about one end of range, i.e. 1430 mm.

Stature in mm.    Frequency

    1480              3
    1530              5
    1580             14
    1630             45
    1680             60
    1730             48
    1780             20
    1830              3
    1880              2

    TOTAL           200

Raw Moments are:
ν1' =   4.5250
ν2' =  22.38
ν3' = 118.0262
ν4' = 657.4275

Note: These lead to the same moments about Mean as obtained from raw
moments about 1655. Hence there is an absolute check on the Arithmetic.

Instead of working with n1', n2' ... (the proportional
frequencies), we can work with y1, y2, ... the actual frequencies,
and then divide the whole by 200. Thus we get the following
(slightly modified) formulae from p. 233 of the paper cited
above.

$$a_1 = -\tfrac{1}{200}\cdot\tfrac{1}{60}\{137y_1 - 163y_2 + 137y_3 - 63y_4 + 12y_5\}$$

$$a_2 = +\tfrac{1}{200}\cdot\tfrac{1}{12}\{45y_1 - 109y_2 + 105y_3 - 51y_4 + 10y_5\}$$

$$a_3 = -\tfrac{1}{200}\cdot\tfrac{1}{4}\{17y_1 - 54y_2 + 64y_3 - 34y_4 + 7y_5\}$$

$$a_4 = +\tfrac{1}{200}\{3y_1 - 11y_2 + 15y_3 - 9y_4 + 2y_5\}$$

$$a_5 = -\tfrac{1}{200}\{y_1 - 4y_2 + 6y_3 - 4y_4 + y_5\}$$

$$b_1 = +\tfrac{1}{200}\cdot\tfrac{1}{60}\{137y_p - 163y_{p-1} + 137y_{p-2} - 63y_{p-3} + 12y_{p-4}\}$$

$$b_2 = -\tfrac{1}{200}\cdot\tfrac{1}{12}\{45y_p - 109y_{p-1} + 105y_{p-2} - 51y_{p-3} + 10y_{p-4}\}$$

$$b_3 = +\tfrac{1}{200}\cdot\tfrac{1}{4}\{17y_p - 54y_{p-1} + 64y_{p-2} - 34y_{p-3} + 7y_{p-4}\}$$

$$b_4 = -\tfrac{1}{200}\{3y_p - 11y_{p-1} + 15y_{p-2} - 9y_{p-3} + 2y_{p-4}\}$$

$$b_5 = +\tfrac{1}{200}\{y_p - 4y_{p-1} + 6y_{p-2} - 4y_{p-3} + y_{p-4}\}$$

In our case

y1 = 3, y2 = 5, y3 = 14, y4 = 45, y5 = 60
yp = 2, yp-1 = 3, yp-2 = 20, yp-3 = 48, yp-4 = 60

Hence we obtain

a1 = +.050083        b1 = +.018667
a2 = -.264583        b2 = -.006250
a3 = +.542250        b3 = -.075000
a4 = -.605000        b4 = +.195000
a5 = +.205000        b5 = -.105000

From these we obtain :— 


A,=a,'—95 43) tas %s = +'04 II 60 16 
B,=b)'- 9503" +aeua0; = +°01 98 75 00 
A, =4@,' —735 My = —"24 05 75 40 
B,=b,’ —8e by =— ‘OI 39 88 09 
A,=a,'—¢; 43 +gigds = +°00 82 31 24 
B,=b'— 35 by +5350; = +02 41 BI 55 
A,=@,'— 95% = —‘21 16 45 80 
B,=b,' — gy by =—'02 38 75 00 


From Equations (xxii) to (xxv) on p. 240, we get the fully 
corrected raw moments to be :— 


py =v, +33{4,+ By} 
Po =e — ds + z40 (B,— Ag} 
Mg =v3'— gy, + {—Fo(43 + B;) + gob Ba + 4p’. B,} 
py =v, — bv, + oda + (zre(4,—B,) —ToPB; + 2o0°B, + 16* . Bi 
In our case the range l = 9, and we get:

$$\mu_1' = \nu_1' + \{.00508626\}$$

$$\mu_2' = \nu_2' - \tfrac{1}{12} + \{.03170069\}$$

$$\mu_3' = \nu_3' - \tfrac{1}{4}\nu_1' + \{.39121823\}$$

$$\mu_4' = \nu_4' - \tfrac{1}{2}\nu_2' + \tfrac{7}{240} + \{.45671958\}$$

where the curled brackets give the correction over and above
Sheppard’s correction.
Thus we get fully adjusted raw moments to be
μ1' =   4.5300862630
μ2' =  22.3283673585
μ3' = 117.2862182312
μ4' = 646.7233862484
Transferring to the Mean (which itself is now changed) we
obtain the Moment-Coefficients about the Mean.
Moments after “full correction”:
μ2 =  1.806686
μ3 = -0.232097
μ4 = [illegible]
and the Mean = 1656.5043 mm.,
with S.D. = 67.11950 mm.
Comparing with our “standard” values we see evident signs
of “over correction.” With such small samples as 200, the P.E.
in terminal frequencies are too great to allow the a’s and b’s to
be calculated with any degree of accuracy. The transfer of one
individual from one group to another would seriously affect the
results.


In order to test this point, I next calculated the a’s and b’s
with a shorter sub-range, i.e. 40 mm., so that the ratio of the
standard unit to the sub-range is 50/40.

Hence


Group (mm.)    Symbol    Frequency

1430-1470      y1           2.0
    -1510      y2           2.5
    -1550      y3           6.0
    -1590      y4          18.5
    -1630      y5          38.0
    -1670      ..          51.0
    -1710      yp-4        40.0
    -1750      yp-3        24.0
    -1790      yp-2        15.0
    -1830      yp-1         1.0
    -1870      yp           2.0

From these we get

a1 = +.001750        b1 = +.093667
a2 = -.048333        b2 = -.305
a3 = +.10            b3 = +.505
a4 = -.11            b4 = -.42
a5 = +.04            b5 = +.16

leading to

a1' = +.002188       b1' = +.117083
a2' = -.075521       b2' = -.476563
a3' = +.195313       b3' = +.986328
a4' = -.268555       b4' = -1.025391
a5' = +.122070       b5' = +.488281

These give
       μ2 = 1.89487686
giving S.D. = 68.83 mm.
and    Mean = 1656.6658 mm.


The values are again quite discrepant from those given above.

With a sub-range of 25 mm. still more widely divergent values
were obtained.

Hence we are obliged to conclude that with small samples,
the probable errors of the terminal frequencies are much too large
to allow Pairman and Pearson’s “full corrections” being calculated
with accuracy.

The general conclusion of the above investigation is this.

There is no indication of appreciable “curtailing” in our
material. Further, with small samples, the “abruptness
coefficients” cannot be calculated with any reasonable degree of
accuracy, and these “full corrections” will necessarily have to be
omitted. But we have already seen that Sheppard’s correction can be
easily applied and should never be omitted.
SECTION II. ON THE STATISTICAL TESTS OF
HOMOGENEITY.


One of the main objects of our present enquiry is to investi- 
gate the ‘‘homogeneity’’ of our material. For this purpose it is 
necessary to have some precise definition of ‘‘ homogeneity.’’ I 
fully realise the great difficulties underlying any attempt at such 
a definition, but in order to avoid confusion of thought I have found 
it impossible to forego at least a working definition. I shall 
approach the problem from a purely statistical point of view. 

“Homogeneity” implies similarity and functional equivalency
among the members of a group of any class of objects. When all 
the members are identical with respect to some definite property, 
homogeneity is perfect with reference to that particular property. 
This is the ideal limit of thought, but in practice it always remains a 
mere intellectual abstraction. 

Thus in actual practice diversity is always present. But if 
the similarity attains a certain intensity we can speak of the 
group as being homogeneous. The actual amount of similarity 
considered necessary to attain this intensity is of course a matter 
of practical convenience. A group which is homogeneous for one 
purpose may be quite heterogeneous for another.¹

“Homogeneity” thus ultimately depends on our standard of
discrimination. If the actual difference between any two members
of a group is less than our unit of discrimination, we can
never become aware of this difference and the group will appear to 
be homogeneous. On the other hand if the actual difference is 
greater, heterogeneity will become evident. If our unit of dis- 
crimination is made indefinitely small and yet no heterogeneity is 
detected, we gradually approach identity, which is the ideal limit 
of thought. 

The concept of “homogeneity” is thus essentially relative
and practical. We can never have any absolute logical criterion
of homogeneity. We must set up separate standards of homo- 
geneity in each case. To this extent the definition of homo- 
geneity is necessarily arbitrary and conventional. But having 
once set up a standard we must rigidly adhere to it. We cannot 
give it up in the middle of a discussion on the plea of arbitrariness. 

The discriminant² may be either qualitative or quantitative;
in either case it should be precise and definite.

We can now proceed to set up tests of homogeneity for our 
special purpose. 


1 Cf. K. Pearson: ‘Skew Variation,’ Biom. Vol. 4 (1906), pp. 176, 192 and
p. 165.

2 e.g. In statistics, the probable error is the fundamental discriminant;
in Experimental Psychology the least perceptible difference is the ultimate
unit.
From the statistical standpoint our first necessity is suitable
graduation of the given sample. This is necessary in order to
draw legitimate inferences about the general population from a
study of the given sample.¹ Our first condition is:

I. We should be able to graduate the given sample by a smooth
curve. That is, the given frequency distribution must be homotypic²
in character.³

The goodness of fit can be tested by the Pearsonian Contingency
Coefficient.⁴

Possibility of graduation by a smooth curve is thus a necessary
condition of statistical homogeneity.

This is not however sufficient. All heterotypic curves are 
excluded, but a homotypic frequency curve need not necessarily be 
homogeneous. For example, it may well happen that a mixture 
of two different homogeneous samples is amenable to graduation 
by a homotypic curve. But even then if the given curve can be
split up into simpler components we get direct evidence of
heterogeneity.

II. Thus our second condition is that the sampled frequency
curve should not be capable of being analysed⁵ into simpler real⁶
components.

Pearson⁷ has furnished us with a technical method for dissection
into two components. But failure in dissection may also
imply that the curve is multi-complex in character, i.e. that it is 
built up of more than two simple components. This second 
condition (impossibility of analysis) again though necessary, is yet 
not sufficient. 

The concept of functional equivalency provides us with
another test. If we consider any sub-sample it should be generally
equivalent to another sub-sample, that is, it should not differ
significantly from other sub-samples. Thus we get:

III. The frequency constants of different sub-samples should
agree within the limits of their own probable error.⁸
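
Condition III can be sketched numerically. The following is only an illustration under stated assumptions, not the procedure of the text: the function names are hypothetical, the probable error of a mean is taken as 0.67449 × S.D./√n (the usual normal-theory value), the probable error of a difference as the root of the summed squares, and the limit of three probable errors is an assumed convention. The statures used are invented figures.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard error (normal theory)

def probable_error_of_mean(sd, n):
    """Probable error of the mean of n measurements with standard deviation sd."""
    return PE_FACTOR * sd / math.sqrt(n)

def sub_samples_agree(mean1, pe1, mean2, pe2, limit=3.0):
    """True when two sub-sample means differ by less than `limit` times
    the probable error of their difference (limit=3 is an assumed convention)."""
    pe_diff = math.sqrt(pe1 ** 2 + pe2 ** 2)
    return abs(mean1 - mean2) < limit * pe_diff

# Hypothetical figures for two sub-samples of 100 statures each (mm.).
pe_a = probable_error_of_mean(68.0, 100)
pe_b = probable_error_of_mean(66.0, 100)
print(sub_samples_agree(1658.0, pe_a, 1654.0, pe_b))
```

The same comparison could be repeated for the S.D. or any other frequency constant, given its own probable error.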


1 We assume throughout that all samples are random samples, that is, we
definitely exclude heterogeneity due to mere “bias” in sampling.

2 Homotypic curves will ordinarily include the Gaussian and the different
Pearsonian skew curves. Other smooth curves (Edgeworth, Charlier, Thiele,
Kapteyn etc.) may also be included.

3 The possibility of suitable graduation of the present material has been
discussed in Section IV, pp. 35-40.

4 The original memoir was given in Phil. Mag. 1900, pp. 157-175. For a
discussion of its use in testing goodness of fit see L. Isserlis: “On the
Representation of Statistical Data.” Biometrika, Vol. XI (1917), pp. 418-425.

5 The possibility of dissection of the present material has been investigated in
Section VI.

6 Negative and imaginary solutions are sometimes obtained; until we can
give a consistent interpretation of these, it is perhaps safer to ignore such purely
mathematical solutions.

7 Memoir on Dissection of Curves, already cited, Phil. Trans. Roy. Soc.,
185A (1894).

8 Strictly speaking, the agreement of subsamples is only an indirect test of
homogeneity. What it actually does serve to show is the representative character
of the given sample.
This condition ensures that the sub-samples will not differ
significantly from the general sample.¹

The above three tests are purely formal and have no reference
to the nature of the material. We can proceed further by taking 
into consideration our previous experience of similar material. 

Let us take the case of stature as an example. In all known 
cases stature distribution is either approximately Gaussian or is of 
Type IV or Type I. Consider the frequency distribution of some
unknown sample. If we find that the curve though homotypic is 
J or U shaped, we are naturally suspicious about the homogeneity 
of the material. The curve may be smooth, it may successfully 
resist dissection, its sub-samples may agree quite well, yet in view 
of our previous experience we would, in the absence of other 
evidence, hesitate to call it homogeneous. 

IV. Our fourth criterion is that the general nature of the
sampled frequency should be the same as that of known homogeneous
material.

This criterion is quite empirical in character and its practical 
utility depends upon what exact significance we can attach to the 
concept of ‘‘ general nature of known frequency constants.’’ 
Though somewhat vague this condition is by no means useless. 


Let us suppose that the given sample is really heterogeneous in character.
Consider a “random” subsample of the given sample. Now if this subsample is
to be representative in character, it must include the same degree of heterogeneity
as is present in the sample itself, that is, in order that it may be a “fair” as well as
a “random” subsample, it is necessary that it should be sufficiently large.
Samples which are large enough to be “fair” will obviously agree among
themselves. Thus the agreement of large fair subsamples cannot reveal the want of
homogeneity of the given sample.

Now consider a subsample which is again “random” but which is not
sufficiently large to include the same degree of heterogeneity as is present in the sample.
Not being representative in character, it will not be surprising if such subsamples fail to
agree. Thus want of agreement on the part of subsamples on account of
their smallness of size will not necessarily prove the existence of heterogeneity in
the material. The lower limit of agreement of random subsamples may however
be looked upon as a measure of homogeneity.

In any case however, agreement of random subsamples does show that these
subsamples are large enough to be representative in character. The given sample,
being larger than its own subsamples, will obviously be large enough to be
representative in character. Thus the agreement of subsamples is a test of the
representative character of the sample, rather than any evidence of the
homogeneity of the material.

An example may help. Consider an ordinary black and white chess board.
Let us look at this chessboard through a sighting hole. The size of this sighting
hole determines the size of the sample. If this size is larger than the size of one
of the squares then each sample will show a mixed patch. In this case subsamples
would agree. On the other hand, if the size of the sighting hole is only a fraction
of the size of a square, then some samples will show white, some black and others
mixed patches. The lower limit, up to which samples agree, is evidently a
measure of the size of the discontinuities. Agreement of subsamples of 100 shows
that 200 is large enough to be representative in character in the present case.
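
The chessboard illustration can be tried out as a small simulation. This is a rough sketch under assumptions of my own: the board is 8 × 8 unit squares, the sighting hole is a square placed at random, the proportion of black is estimated on a grid of points, and all function names are hypothetical.

```python
import random

def black_fraction(x0, y0, hole, steps=40):
    """Approximate fraction of black seen through a square hole of side `hole`
    (in units of one chessboard square) whose corner is at (x0, y0)."""
    black = 0
    for i in range(steps):
        for j in range(steps):
            x = x0 + (i + 0.5) * hole / steps
            y = y0 + (j + 0.5) * hole / steps
            if (int(x) + int(y)) % 2 == 1:  # neighbouring squares alternate in colour
                black += 1
    return black / (steps * steps)

def spread_of_samples(hole, trials=200, seed=0):
    """Range (max - min) of the black fraction over randomly placed holes."""
    rng = random.Random(seed)
    fractions = []
    for _ in range(trials):
        x0 = rng.uniform(0.0, 8.0 - hole)
        y0 = rng.uniform(0.0, 8.0 - hole)
        fractions.append(black_fraction(x0, y0, hole))
    return max(fractions) - min(fractions)

print(spread_of_samples(0.25))  # hole a fraction of one square: samples disagree
print(spread_of_samples(3.0))   # hole larger than one square: samples nearly agree
```

The hole size at which the spread first becomes small plays the part of the “lower limit” of agreement described above.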

1 This implication serves as the basis of Pearson's discussion of P.E. of
sub-samples for comparison with the general sample. K. Pearson: “Note on
the Significant or Non-significant Character of a sub-Sample drawn from a
Sample.” Biometrika Vol. 5 (1906), pp. 181-183.
We require some further precise quantitative test. This is
supplied by the variability (both absolute, as measured by the
Standard Deviation and relative, as measured by the Coefficient
of Variation) of the distribution.¹

V. The variability of the sample should not be significantly
greater than the average variability of the same organ for known
homogeneous material.

The Coefficient of Variation, V (multiplication by 100 is merely
for arithmetical convenience) is a straightforward measure of
variability. It is of course possible to set up other standards by
choosing some other function of the S.D. and Mean, f(σ/M), but it
is quite unnecessary to enter into such subtleties in the present stage
of our knowledge.
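
As a small worked example, the Coefficient of Variation is 100 times the S.D. divided by the Mean. The sketch below applies that definition to the Mean and S.D. quoted for the 50 mm. grouping; the function name is hypothetical.

```python
def coefficient_of_variation(mean, sd):
    """V = 100 * S.D. / Mean, a relative measure of variability."""
    return 100.0 * sd / mean

# Mean and S.D. (mm.) as quoted for the 50 mm. grouping.
v = coefficient_of_variation(1656.25, 67.3849)
print(round(v, 4))
```

Because V is a pure ratio, it allows the variability of stature to be compared with that of organs measured on quite different scales.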

It is quite easy to extend the above condition to the case of
more than one organ. In that case we shall have to define
variability by the generalised² or multiple probable error of the group
of organs considered.³

We have thus got five different tests of “homogeneity.” It
should be remembered that we have all along discussed statistical
homogeneity. Whether statistical homogeneity necessarily implies
anthropological homogeneity and vice versa, is a very difficult
question,⁴ into which I do not propose to enter. I confine
myself to a consideration of purely statistical homogeneity.


1 For a full discussion see Pearson: Chances of Death, “Variation in Man
and Woman,” pp. 255-377, specially pp. 272-286. Also Appendix I.

2 K. Pearson and Alice Lee: “On the Generalised Probable Error in
Multiple Normal Correlation,” Biom. Vol. 6 (1908), pp. 59-68.

3 Incidentally we may note that variability gives us a convenient method of
defining a “normal” group (in a medical, psychological or social sense) of
individuals. The normal group (with reference to some particular trait) consists of the
individuals included between the Mean, x̄, and p times the S.D. σ, where p is an
arbitrary number. Thus a “normal” individual is one who does not differ from
the average type of his class by more than p.σ. By a proper choice of p we can
make our definition as elastic or as stringent as we please. We can also extend
the definition to cover more than one single trait, with the help of the generalised
or multiple probable error.

4 K. Pearson: “Craniological Notes. Homogeneity and Heterogeneity in
Collections of Crania,” Biom. Vol. 2 (1903), pp. 345-347. Also see C. Myers'
Reply to above and Pearson's Remarks on the Reply, Biom. Vol. 2 (1903), pp.
504-508, and Aurel von Török's Note and Pearson's Reply. Ibid., pp. 508-510.


SECTION IV. TYPE OF CURVE AND “GOODNESS
OF FIT.”


We shall now test the “goodness of fit” with our “normal”
curve. K. Pearson¹ has shown how this may be done. He shows
that² if

\[ \chi^2 = S\left\{ \frac{(m' - m)^2}{m} \right\}, \]

where S denotes a summation, m' and m are observed and
theoretical values in each sub-group, then the chances of a
system of errors with as great or greater frequency than that
denoted by χ² is given by

\[ P = \frac{\int_{\chi}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n'-2}\,d\chi}
            {\int_{0}^{\infty} e^{-\frac{1}{2}\chi^2}\,\chi^{n'-2}\,d\chi}, \]

where n' is the number of sub-groups.

Tables³ have been calculated to facilitate calculation of P
when χ² is known.
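
Where tables are not at hand, χ² and P can be computed directly. The sketch below is an illustration and not the tabular method used in the text: it assumes that n' counts the groups, so that the integral has n' − 1 degrees of freedom, and it evaluates P by the standard finite series for whole numbers of degrees of freedom. The function names are hypothetical.

```python
import math

def chi_square(observed, expected):
    """Sum over groups of (m' - m)^2 / m."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value(chi2, n_groups):
    """Chance of a chi-square as great or greater, for n' = n_groups groups
    (assumed here to mean n' - 1 degrees of freedom)."""
    dof = n_groups - 1
    x = chi2 / 2.0
    if dof % 2 == 0:
        # even degrees of freedom: exp(-x) * (1 + x + x^2/2! + ...)
        term, total = 1.0, 1.0
        for j in range(1, dof // 2):
            term *= x / j
            total += term
        return math.exp(-x) * total
    # odd degrees of freedom: normal tail area plus a finite series in chi
    chi = math.sqrt(chi2)
    total, term = 0.0, chi
    for j in range(1, (dof - 1) // 2 + 1):
        if j > 1:
            term *= chi2 / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(x)) + math.sqrt(2.0 / math.pi) * math.exp(-x) * total

print(p_value(22.0, 19))    # can be checked against the tabulated entry for n' = 19
print(p_value(22.699, 19))
```

The exact value for χ² = 22.699 can differ slightly from one obtained by linear interpolation between tabulated entries, since P is not linear in χ².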
Pearson then shows⁴ that if χ² for the sample is so small as to
warrant us in speaking of the frequency distribution as a random


1 K. Pearson: “On the Criterion that a Given System of Deviations from
the Probable in the Case of a Correlated System of Variables is such that it can be
reasonably supposed to have arisen from Random Sampling.” Phil. Mag. July
1900, p. 157.

2 χ² is thus quite easy to calculate; it is given by

S { (square of difference of theoretical and observed values) / (theoretical value of frequency) }.

3 W. Palin Elderton: “Tables for Testing Goodness of Fit.” Biom. Vol.
1 (1902), pp. 155-163. Reprinted as Table XII on p. 26 of Tables for
Statisticians, etc.

4 Pearson, paragraph 5 and following of reference 1.
variation of the frequency distribution determined from itself, then
we may also speak of it as a random sample from a general popula- 
tion whose theoretical distribution differs only by quantities of the 
order of the probable errors of the constants from the distribution 
deduced from the observed sample. 

Thus if a curve is a good fit to a sample, to the same fineness
of grouping it may be used to describe other samples from the same 
population. If a curve serves to any degree, it will serve for all 
rougher degrees, but it does not follow that it will suffice for 
still finer groupings. A good fit for a large sample would be a 
good fit for a smaller sample but not necessarily for a larger 
one. 

I shall test the Goodness of Fit for different groupings.
I shall next compare the fit for the same grouping given by
the slightly different values of the Standard Deviation calculated
with different units of grouping. This will test how the Goodness
of Fit is affected by different units of grouping adopted in
calculating the frequency constants.


NORMAL CURVE. 


I have calculated the theoretical frequencies from the “raw”
(i.e. uncorrected by Sheppard's adjustment) values of the S.D.
in some cases. For “if the ordinates of a normal curve be
calculated from the raw second moment value of the Standard
Deviation, these ordinates will more closely represent the actual
frequencies than do the ordinates of the true normal curve, which
have to be corrected by the factor

\[ 1 + \frac{h^2}{24\sigma^2}\left(\frac{x^2}{\sigma^2} - 1\right) \]

to obtain the actual frequencies” (h being the unit of grouping and
x the deviation from the Mean).

If therefore our sole object is to compare observed and
calculated frequencies for definite series of groups, there are
advantages in using the “raw” second moment in the equation to
the curve. Such a curve has been termed by Sheppard a “spurious
curve of frequency.”²
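
The advantage of the “raw” S.D. can be checked numerically. In the sketch below the function names are hypothetical, and the S.D. of 69.00 mm. quoted in Table 4 for 50 mm. grouping is treated as a raw value, which is an assumption on my part. Sheppard's adjustment subtracts h²/12 from the raw second moment, and the group frequency of the true normal curve is compared with the width × mid-ordinate estimate from each S.D.

```python
import math

def sheppard_corrected_sd(raw_sd, h):
    """Sheppard's adjustment: corrected variance = raw variance - h^2 / 12."""
    return math.sqrt(raw_sd ** 2 - h ** 2 / 12.0)

def normal_cdf(z):
    """Area of the unit normal curve below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def group_frequency(x, h, sd):
    """Exact proportion of a normal population (S.D. sd) falling in the
    group of width h centred at deviation x from the mean."""
    return normal_cdf((x + h / 2.0) / sd) - normal_cdf((x - h / 2.0) / sd)

def mid_ordinate_estimate(x, h, sd):
    """Group width times the normal ordinate at the mid-point of the group."""
    return h * math.exp(-x * x / (2.0 * sd * sd)) / (sd * math.sqrt(2.0 * math.pi))

h = 50.0        # unit of grouping, mm.
raw_sd = 69.00  # assumed to be an uncorrected value
true_sd = sheppard_corrected_sd(raw_sd, h)
for x in (0.0, 100.0):
    exact = group_frequency(x, h, true_sd)
    print(x, exact, mid_ordinate_estimate(x, h, true_sd),
          mid_ordinate_estimate(x, h, raw_sd))
```

In both groups the ordinate computed from the raw S.D. lies closer to the exact group frequency than the ordinate of the true curve, which is the point of the quotation.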


1 For a discussion of another test of Goodness of Fit proposed by Prof.
Edgeworth see a Note by L. Isserlis: “On the Representation of Statistical
Data,” Biometrika 1917, pp. 418-425.

2 Editorial Note: “On an Elementary Proof of Sheppard's Formulae
for correcting Raw Moments and on other Allied Points,” Biom. Vol. 3 (1904),
p. 311.
TABLE 2.

Mean = 1656.2938 mm.        Unit = 20 mm.
S.D. = 67.3845 mm.          1/σ = .2960802.

               Observed   Theoretical
Stature.        Value       Value      (m' - m)   (m' - m)²/m
                 m'.         m.
Beyond 1475       3         1.102       1.897       3.265
1475-1495         1         1.3734      0.373       0.101
    -1515         4         2.662       1.338       0.673
    -1535         2         4.7229      2.722       1.569
    -1555         4         7.6867      3.686       1.768
    -1575        10        11.4571      1.457       0.185
    -1595        12        15.6481      3.648       0.850
    -1615        25        19.5825      5.418       1.499
    -1635        32        22.4535      9.546       4.058
    -1655        21        23.5887      2.588       0.284
    -1675        17        22.7104      5.710       1.436
    -1695        21.5      20.2305      1.269       0.7960
    -1715        18.5      16.1889      2.311       0.330
    -1735        10        11.9874      1.987       0.329
    -1755         5         8.1340      3.134       1.209
    -1775        10         6.2048      3.795       2.321
    -1795         2         1.733       0.267       0.041
    -1815         0         1.5037      1.504       1.504
Beyond 1825       2         1.230       0.770       0.482
                200       200.1975               χ² = 22.699


The above table gives observed and theoretical values for 20
mm. grouping. These have been plotted both in histogram and
in mid-ordinate continuous curve form. (See Plate I).


The equation to the theoretical Gaussian is (in 20 mm. work- 


ing units) :— 
i . poke 2 
Y= 25°082 x» Exp: aa Coie) 
30°3259 
where X = stature in mm.
      Y = frequency.

Mean = 1656.2938 mm.
S.D. = 67.38498 mm.
Unit of grouping = 20 mm.


In order to avoid fractions of individuals in theoretical values
we stop at 1475 mm. and 1825 mm.

With n' = 19, χ² = 22.699.
From Table XII, p. 26 we find
for χ² = 22        P = .231985
    χ² = 23        P = .190590
    difference         .041395
for χ² = 22.699    P = .231985 - .699 × (.041395)

Thus P = .2030.
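
The interpolation above is simple proportional parts between two tabulated entries. A minimal sketch, with a hypothetical function name and the two entries for n' = 19 taken as quoted in the text:

```python
def interpolate_p(chi2, chi2_lo, p_lo, chi2_hi, p_hi):
    """Linear interpolation of P between two tabulated values of chi-square."""
    fraction = (chi2 - chi2_lo) / (chi2_hi - chi2_lo)
    return p_lo + fraction * (p_hi - p_lo)

# Tabulated entries for n' = 19 as quoted in the text.
p = interpolate_p(22.699, 22.0, 0.231985, 23.0, 0.190590)
print(round(p, 4))  # 0.203, i.e. P = .2030 as found above
```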
We can now find the probable error of P. Pearson¹ has
shown that
en 37 y2 Es (x”) — Po - s(x”) } 
and
\[ \sigma^2_{\chi^2} = 2(g-1) + \frac{g}{H} + \frac{g(g-1)}{N}, \]
where g = number of cells and H = harmonic mean of expected
frequency.
In the present case, g = 19, N = 200, g/H = 4.4137.

Hence σ²(χ²) = 42.1237, giving ½σ(χ²) = 3.245;
also Pi, = *2030 and P\,="1226 
thus gd, =0°2600. | 


we get finally, P = .2030 ± .1760.
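
The arithmetic for the variance of χ² can be verified as follows. The sketch merely evaluates the expression 2(g − 1) + g/H + g(g − 1)/N as it is printed in the text, with the quoted values of g, N and g/H; the function name is hypothetical, and no claim is made here about the further steps leading to the probable error of P.

```python
import math

def chi2_variance(g, n, g_over_h):
    """Variance of chi-square as printed in the text:
    2(g - 1) + g/H + g(g - 1)/N."""
    return 2.0 * (g - 1) + g_over_h + g * (g - 1) / n

variance = chi2_variance(g=19, n=200, g_over_h=4.4137)
half_sigma = 0.5 * math.sqrt(variance)
print(round(variance, 4), round(half_sigma, 3))
```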


The chances are 4 to 1 against its being a random sample.
In other words about once in five trials we would get worse
fits than this. The probable error of P is large. Still the fit is
not very bad, for odds of 4 to 1 cannot be considered excessive.

We notice that the contributions of the terminal ranges to
χ² are heavy, being 3.265, 1.504 and .482. Combining the two
terminal groups at each end we find χ² = 18.482, and n' = 17. We
get P = .2978 which gives a decent fit. In three trials out of ten,
random sampling would give us worse fits.


TABLE 3.

Mean = 1656.25 mm.        Unit of grouping = 50 mm.
S.D. = 67.3849 mm.

                  Observed   Theoretical
Stature in mm.     Value       Value      (m' - m)   (m' - m)²/m
                    m'.         m.
Beyond 1530          8         6.0993      1.9007       .5902
1530-1580           14        19.6830      5.6830      1.6408
    -1630           45        43.9045      1.0955       .0231
    -1680           60        57.8639      2.1361       .0788
    -1730           48        45.0741      2.9259       .1800
    -1780           20        20.7464      0.7464       .0268
Beyond 1780          5         6.6292      1.6292       .4002
                   200       200.0004      n' = 7    χ² = 2.8399


1 Phil. Mag. Vol. XXXI, 1916, pp. 369-378.

From Tables by interpolation, we get
P = 0.826583 ± .288686;
the probable error is large, but a high value of P is not improbable.

The fit is now excellent. In 83 trials out of 100 the fit
will be worse than this. We conclude therefore that with 50 mm.
grouping, the Gaussian curve is quite adequate for purposes of
graduation. With this unit of grouping we may then safely
investigate the statistical properties of the general population.¹ In
subsequent analysis we have for this reason always adopted 50
mm. as our unit. With finer groupings we are likely to obtain
mere individual peculiarities of our sample which may not have
any connexion whatever with the properties of the general
population.

We shall try the effect of other values of Mean and S.D. on
“Goodness of Fit.”

With 20 mm., M = 1656.2938, S.D. = 67.1325.
n' = 19, χ² = 25.5942, P = .109881.

Only once in ten trials the fit will be worse. The end
contributions being rather large, we again combine the terminal
frequencies and obtain a much better fit.

n' = 17, χ² = 21.2072, P = .1712.

That is once in six trials we will get a worse fit.


TABLE 4.

Summary of “Goodness of Fit.”

Mean.            S.D.        n'.     χ².        P.

Unit of grouping = 20 mm.

1656.2938 mm.    67.1315     19     25.5942    .109881
                             17     21.2073    .1712
1656.2938        67.38498    19     22.699     .203009
                             17     18.482     .2978

Unit of grouping = 50 mm.

1656.25 mm.      69.00        7      3.4775    .75
1656.51          67.2195      7      2.9382    .815698
1656.25          67.475       7      3.0209    .806585
1656.25          67.38498     7      2.7788    .833398


20 mm. gives a fit of about the same order in each case. 
Even with such fine grouping, we get an indication that Gaussian 
distribution is not impossible, but we cannot assert that the normal 
curve is fully adequate. 

1 This is the reason why 50 mm. is selected as our standard unit of grouping,
for purposes of comparison. See page 21.

With 50 mm., the fit is excellent in every case. Even with
the highest observed value of S.D., namely 69.00 mm., we
get P greater than .75, i.e. in three cases out of four, a random
fit will be worse. Thus we see that the effect of different units
of grouping (in calculating moment coefficients) on the Goodness of
Fit is negligible.

We must note however that the Goodness of Fit is a much
more sensitive criterion than P.E. in judging the accuracy of a S.D.
We notice that with S.D. = 67.385 (the value finally adopted) P is .83,
which is substantially better than P = .75 with S.D. = 69.00 mm.

We conclude that with 50 mm. unit of grouping, a Gaussian
curve is fully adequate in every way.¹


NOTE ON THE LIMITS OF THE UNIT OF GROUPING.


In section I we saw that up to a certain unit of grouping,
which in our case was 50 mm., the effect of grouping on the
frequency constants was negligible. Let this upper limit of
grouping be h_m. On the other hand, in the present section, we
have seen that there is a lower limit of grouping for which the
goodness of fit is satisfactory. Let this lower limit be h_f. In our
case, it is again 50 mm.

Evidently, the sizes of h_m and h_f both depend on the size
of the sample. If the distribution is truly Gaussian, then these
should depend only on the size of the sample and the S.D. It
will be extremely useful to obtain even a rough idea about h_m and
h_f for any given size of sample.

We can study the problem empirically. We must remember
Bernoulli's law which requires that accuracy should depend on
the square root of the total number of measurements. As the
simplest alternative we can try, if N is the total size of sample and
A and B are constants,

\[ h_m = A\sqrt{N} \quad\text{and}\quad h_f = B/\sqrt{N}. \]

In our case we have h_m = 50 mm. and h_f = 50 mm. Substituting,
we get

A = 50/√200 = 3.5355
B = 50·√200 = 707.1068

I provisionally suggest that 

(a) In the case of Stature, in calculating frequency constants,
the unit of grouping should be less than 3.5√N mm.

(b) In testing goodness of fit, the unit of grouping should be
greater than 700/√N mm.

I do not of course attach much value to the numerical 
magnitudes of A and B given here; study of a single example is 
obviously not sufficient. I give the above analysis as a suggestion. 


1 This result is well brought out in the 50 mm. graph, but it is quite
impossible to judge the goodness of fit by merely looking at a curve.
Adopting the above values of A and B, we get the following
table:

    N     h_m    h_f
   10      11    222
   20      16    157
   50      25    100
  100      35     70
  200      50     50
  500      80     30
 1000     110     20

With small samples of 10, h_m is 11. Grouping for calculation
of frequency constants is thus justified even in the case of small
samples. On the other hand for N = 10, h_f is over 200 mm. which
shows the absolute impossibility of judging the adequacy of fit
in the case of small samples. In fact with samples of less than 50
(for which h_f = 100 mm.) it is practically impossible to test the
goodness of fit and hence to judge the reliability of any inference
about the general population. Even with N = 1000, the lower
limit is not reduced below 20 mm. Thus, discontinuities of less
than 20 mm. may easily escape in samples of 1000.

It should be observed that so long as h_f is greater than h_m,
we cannot hope to attain great accuracy in judging the significance
of a fit so far as the general population is concerned. We see,
however, that with samples of 200, h_f = h_m = 50 mm. It then
becomes only just possible to assert anything about the population
sampled with any certainty. It seems as if 200 is the lower limit
of safe sampling for anthropological purposes (at least so far as
stature is concerned).


TYPE IV. SKEWNESS, LEPTO-KURTOSIS. 
For Anglo-Indian Stature, our fundamental constants are (in 
50 mm. working units). 


Mean=16 56°79, +3'2I1 36 mm. 


S:D.= 67°38 49 842.55 85 mm. 
— 4°06 720 + 
a. = 706 87 564 07 97 81 
B, = 300040 Foe 00-17 
Se +°I10 53. + ‘05 68 
= 709 63» «=6©+4°78 18 mm. 
pe = I°8rt 62 94+ ‘12 24 QI 
ac —'64 18 53+ ‘28 60 80 
f= 19°56: 54 031°81- Or 17 
a= 86 66 61 (!) 


1 From Biometric Table XLII (a), p. 78. 
The curve is not significantly skew. But there is distinct
tendency towards lepto-kurtosis.

The curve belongs to Type IV of Pearson's Skew Curves.¹
The probable errors of β₁ and β₂ are quite large, and we may
investigate whether the β₁-β₂ “probability ellipse” touches the
Gaussian point G.²

In order to do this we must find =, and *,, the semi-minor 
and semi-major axis of the ‘‘ probability ellipse.’’ 


B,= °06 87 56 
B2=3°5 U177y N.2,=1'4+°37 912 x [-4]=1'551648 
1°6885 
= Btls I5+° ee 5 |= ee 
3 2 5 +°37 912°x [75] er 
B,=3'50 46, 1'177*/N3,= 155 79 gr | 
Similarly 177 / N3,=13'51 TE 


Multiplying by x, ='04769, we get 
semi-minor axis =°0743 
semi-major axis=°6446 


Tracing a probability ellipse with these values and centering
the ellipse at the point β₁ = .07 and β₂ = 3.5 approximately, on
the diagram on p. 66 of Biometric Tables, we find that the Gaussian
point G falls just within the ellipse. We also note that the ellipse
covers a small area of the Type III region.

We conclude therefore that a Gaussian distribution itself
is not unlikely and may be expected to give a good fit. Type
III is not altogether impossible but as the major portion of the
ellipse lies within the Type IV region, the lepto-kurtosis is
probably just significant.³


COMPARATIVE DATA. 


Our frequency curve is approximately Gaussian in type.
The asymmetry is very slight, skewness is small and positive
(Mode is greater than the Mean) and the curve belongs to Type IV
with lepto-kurtosis.

A. O. Powys⁴ has discussed distribution of stature for
different age groups of New South Wales criminals. The author
says, “by looking at the curves, we see that the material is
extremely homogeneous⁵ .... the stature distribution of these

1 See Memoirs cited above in footnote on p. 16.

2 A discussion of these points is given by A. Rhind: “Additional Tables
and Diagram for the Determination of the Errors of Type of Frequency
Distribution.” Biometrika Vol. 7 (1910), pp. 380-397.

3 The asymmetry is very slight and the distance between the Mode and
the Mean is also quite small. On the whole there is very little to choose between
the “normal” and a Type IV curve. The latter may give slightly improved fit.

4 A. O. Powys: “Anthropometric Data from Australia,” Biometrika Vol.
1 (1902), p. 30.

5 Ibid., p. 38.
homogeneous groups is nearly normal, but what divergence
there is lies in the direction of Type IV.”¹ In the case of
males, the skewness is always positive and the Mode is greater
than the Mean.² Powys used very long series of measurements
extending to several thousands in each age group. The
distribution is lepto-kurtic in every case.

W. R. Macdonell³ finds in the case of 3,000 English convicts
the stature curve to be of Type IV. The skewness is small and
negative and there is slight lepto-kurtosis. Mode is less than the
Mean.

In the case of Verona statistics⁴ the stature of 16,203
conscripts shows significant lepto-kurtosis⁵ and a Type IV
distribution while 3,810 selected recruits show equally significant
“platy-kurtosis.”⁶ Both have significant positive asymmetry
and Mode is greater than the Mean.

J. F. Tocher⁷ finds lepto-kurtosis for the Scottish Insane;
the curve belongs to Type IV, and there is small positive
skewness, Mode being greater than the Mean.⁸ “For long series then,
viz. New South Wales males, Italian conscripts, Italian recruits
and Scottish Insane, there is agreement as to skewness: in all four
cases it is significantly positive; in one case, the American recruits,⁹
there is quite significant negative asymmetry.” American recruits
also differ in showing meso-kurtosis.¹⁰

Charles Goring¹¹ in the case of the English convict found
the distribution approximately Gaussian in type for all
crime-groups excepting one. In the only case in which the
distribution is significantly different from the normal, the curve is
of Type IV with significant lepto-kurtosis and marked positive
skewness.

Orensteen¹² found in the case of Cairo-born Egyptians, that
the distribution was nearly symmetrical. The criterion K
however is less than 1, hence the curve really belongs to Type IV.


1 Ibid., p. 30.

2 Ibid., p. 43. Powys mentions skewness as negative. This is probably a slip.

3 W. R. Macdonell: “On Criminal Anthropometry and the Identification
of Criminals,” Biom. Vol. 1 (1902), pp. 177-227.

4 Quoted in Miscellanea, Biom. Vol. 4 (1906), p. 506 and referred to by
J. F. Tocher (see below).

5 Lepto-kurtic curves are more sharp-topped than the normal curve, the
rise being sharper than the Gaussian.

6 Platy-kurtic is “flat-topped” as compared to the Gaussian.

7 J. F. Tocher: “Anthropometric characteristics of the Inmates of Asylums
in Scotland,” Biom. Vol. 5 (1907), p. 301.

8 Ibid., p. 182. Tocher says that for long series asymmetry is negative.
He evidently means μ3. This however is slightly ambiguous and may give rise
to confusion. I have thought it better to refer to Skewness in each case, which
has its sign opposite to that of μ3, so that Mode is greater or less than the Mean
according as skewness is positive or negative (and μ3 negative or positive).

9 K. Pearson, Phil. Trans. Roy. Soc., Vol. 186 A (1894), p. 385.

10 Meso-kurtosis signifies about the same degree of flatness as the Gaussian.

11 Charles Goring: “The English Convict,” p. 199.

12 Myer M. Orensteen: “Correlation of Anthropometrical Measurements in
Cairo-born Natives,” Biom. Vol. XI (1915), p. 71.
Conclusion.

(1) The Gaussian curve is quite adequate for graduating a
short series of 200 Anglo-Indian measurements. This confirms
C. D. Fawcett's rule of normal distribution for short series¹ of
anthropometric measurements.

(2) There is some tendency towards Type IV, with
lepto-kurtosis. All long series, with the exception of American and
Italian recruits, seem to be definitely lepto-kurtic. It is therefore
likely that stature distribution is in general slightly lepto-kurtic
in character, but this small lepto-kurtosis does not become
statistically significant in small samples.

(3) Skewness is small and positive (Mode being greater than the
Mean) for New South Wales criminals, Italian conscripts, Italian
recruits, Scottish insane and a short series of several offenders
among English criminals. It is negative in the case of several
short series of English criminals, and for one long series viz.
American recruits. For a short series of Anglo-Indians it is
positive but is so small that it cannot be called significant.
Hence we conclude that the small skewness of our present sample
is not incompatible with homogeneity.

(4) We conclude therefore that the distribution of stature in the
case of Anglo-Indians is of the same nature as in the case of other
samples where the material is known to be “homogeneous.” In
other words, the nature of distribution of stature does not reveal
any presence of heterogeneity in the Anglo-Indian population.²


1 Biometrika Vol. 1 (1902), p. 443.

2 Type IV of course is absolutely no indication against homogeneity. For a
detailed discussion of this point see K. Pearson: “Skew Variation, a Rejoinder,”
Biometrika Vol. 4 (1905), p. 181.


SECTION V. DISSECTION INTO COMPONENT CURVES. 


I shall next consider the possibility of statistical dissection of
our frequency curve. It might be possible that the sample
consisted of two (statistically) different strains. If this were so then it
would be possible to break up the frequency distribution into two
component normal distributions.

The fundamental memoir on this subject is K. Pearson: “On
the Dissection of Asymmetrical Frequency Curves.”¹ Pearson has
discussed the application of the theory in several² actual cases
and³ has given the fundamental equations in a somewhat better
form in a paper “On the Problem of Sexing Osteometric
Material.”⁴ I have followed the notation of the fundamental memoir,
excepting in one or two instances, where I have used a slightly
modified notation.

But before proceeding to a full discussion of the subject it
will be useful to apply some simpler tests of homogeneity.


AGREEMENT OF SUB-SAMPLES. 


The whole group of two hundred cards were arbitrarily 
divided into two sub-groups of 100 cards each. The Frequency 
Constants were calculated for each of these two sub-groups and 
compared. 


The unit of grouping adopted was 50 mm. in each case. 


Mean:

1st group of 100 = 1658.75 ± 4.6436 mm.
2nd group of 100 = 1657.00 ± 4.9414
Difference       =    1.75 ± 6.7808 ⁵

Standard Deviation:

2nd group  = 73.26 ± 3.49 mm.
1st group  = 68.85 ± 3.28
Difference =  4.41 ± 4.79


1 Phil. Trans. Vol. 184A (1894), pp. 71–110.

2 K. Pearson: “On the Applications of the Theory of Chance to Racial
Differentiation,” Phil. Mag. 1901, p. 110.

3 K. Pearson: “On the Probability that two Independent Distributions of
Frequency are really Samples of the Same Population, with Special Reference
to Recent Work on the Identity of Trypanosome Strains.” Biometrika Vol.
10 (1915), p. 123 ff.

4 Biometrika Vol. 10 (1915), pp. 479–487.

5 It is well known that the P.E. of a sum or a difference is given by the square
root of the sum of the squares of the P.E. (see Yule, Statistics, p. 211).
Coeff. of variation:


2nd group  = 4.4212 ± .2113
1st group  = 4.1504 ± .1983
Difference =  .2708 ± .2897


The difference is in no case significant.
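
The comparison just made can be sketched in a few lines of Python. This is only an illustration of the rule quoted from Yule (the probable error of a difference is the root of the sum of the squares of the two probable errors); the function names `pe_of_difference` and `compare` are hypothetical, and the figures are those tabulated for the two sub-groups.

```python
import math


def pe_of_difference(pe_a, pe_b):
    """Probable error of a sum or difference of two independent quantities."""
    return math.hypot(pe_a, pe_b)


def compare(a, pe_a, b, pe_b):
    """Return (absolute difference, its probable error, ratio of the two)."""
    diff = abs(a - b)
    pe = pe_of_difference(pe_a, pe_b)
    return diff, pe, diff / pe


# Means of the two sub-groups of 100 (mm.), as tabulated in the text.
mean_diff, mean_pe, mean_ratio = compare(1658.75, 4.6436, 1657.00, 4.9414)

# Third moment-coefficients of the two sub-groups (50 mm. units).
mu3_diff, mu3_pe, mu3_ratio = compare(-0.287592, 0.5196, -1.299010, 0.4314)

print(f"means: difference {mean_diff:.2f}, P.E. {mean_pe:.4f}, ratio {mean_ratio:.2f}")
print(f"mu3:   difference {mu3_diff:.6f}, P.E. {mu3_pe:.4f}, ratio {mu3_ratio:.2f}")
```

A ratio below 1 means the difference is smaller than its own probable error; the third moment is the one constant for which the ratio exceeds 1.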


Passing on to the other constants we get:


μ₂:
1st group  = 2.146651 ± .2048
2nd group  = 1.896075 ± .1809
Difference = 0.250576 ± .2733
μ₃:
1st group  = − 0.287592 ± .5196
2nd group  = − 1.299010 ± .4314
Difference =   1.011418 ± .6753
μ₄:
1st group  = 12.269601 ± 3.0455
2nd group  = 11.912723 ± 2.3219
Difference =  0.356878 ± 3.8300
β₁:
1st group  = .08357 ± .1099
2nd group  = .24890 ± .1204
Difference = .16533 ± .1630
β₂:
1st group  = 3.3256 ± .6580
2nd group  = 2.6627 ± .2609
Difference =  .6629 ± .7078
We conclude that the first hundred measurements are not
significantly differentiated from the second hundred in any way.
Both represent “random” samples of the same general population.

It should be noted however that the difference between the
two samples of hundred each is of the same order as the probable
error of the difference. In one case, viz. μ₃, the difference is
actually greater than its probable error. This shows that 100 is
very nearly approaching the critical limit of “fair (i.e.
representative) sampling.” [See section III, footnote 8, pp. 32-33].

There is grave danger of samples of less than one hundred being
not representative in character (at least so far as the stature of
populations of the same order of variability as the Anglo-Indian is
concerned). The discussion on p. 40, Section IV, shows however
that two hundred is about the lower limit for safe inferences about
the general population.


TRIAL SOLUTIONS BY “TAIL” FUNCTIONS.


Consider a mixture of two homogeneous components. If the
Means of these components are sufficiently wide apart, the “tail”
(i.e. the terminal frequencies) on each side will represent an
approximately homogeneous part of the component on that side.
Or if the variability of one component is sufficiently greater than
the other, the terminal frequencies on its own side will give a
fairly homogeneous “tail,” even though the Means are not widely
different.

We can fit a normal (Gaussian) curve to the “tail,” that
is, to the terminal frequencies only, with the help of the
“tail” functions. If the “tail” is significantly different from
the whole sample, then the Gaussian which describes the “tail”
satisfactorily may be quite different from the Gaussian which fits
the whole sample. For example if we get two “tail” distributions
which are each different from the whole distribution, and
yet when added together reproduce the total distribution, then we
are pretty certain that these “tails” each represent one component
of the given sample. Even when we find only one “tail”
which is different from the total distribution we can always find
the other component by subtraction from the total curve.

This method belongs to the trial and error type. The “tail
curves” obtained by considering different portions of the tail may
themselves differ. The uncertainty in the terminal frequencies
must be considerable and as Dr. Lee observes, “the chief weakness
of the method, besides the assumption of the Gaussian, often
quite legitimate, is the absence as yet of the values of probable
errors, which must be very considerable for slender material.”¹

For the purposes of “tail” functions, 50 mm. gives too
broad groupings. Hence I have found it necessary to work with
20 mm. groupings.

Curtailing at 1585, we get the following:


Group (mm.)   1585    1565    1545    1525    1505    1485    1465
             -1565   -1545   -1525   -1505   -1485   -1465   -1445   Total.

Frequency      10       4       2       4       1       1       2      24


Taking origin at end of range 1585, we get raw moments
ν₁′ = d = 2.208333 and ν₂′ = 8.666667
μ₂ = Σ² = 3.789931
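
The raw moments quoted for the tail curtailed at 1585 can be checked with a short Python sketch. It assumes, as the figures imply, that each 20 mm. group is concentrated at its mid-point and that distances are measured from the stump in units of the group width; the helper name `tail_moments` is hypothetical.

```python
def tail_moments(freqs, width=1.0):
    """Raw moments of a tail about its stump, in units of the group width.

    Each group's frequency is placed at the mid-point of the group,
    i.e. at 0.5, 1.5, 2.5, ... group widths from the stump.
    """
    total = sum(freqs)
    mids = [(i + 0.5) * width for i in range(len(freqs))]
    nu1 = sum(f * x for f, x in zip(freqs, mids)) / total
    nu2 = sum(f * x * x for f, x in zip(freqs, mids)) / total
    mu2 = nu2 - nu1 ** 2          # second moment about the tail's own mean
    return total, nu1, nu2, mu2


# Frequencies of the seven 20 mm. groups below 1585 mm., nearest group first.
freqs_1585 = [10, 4, 2, 4, 1, 1, 2]
total, nu1, nu2, mu2 = tail_moments(freqs_1585)
print(total, round(nu1, 6), round(nu2, 6), round(mu2, 6))
```

The same function applies to any other curtailment: only the list of frequencies changes.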


1 K. Pearson and Alice Lee: Generalised Probable Error in Multiple Normal
Correlation. Biometrika Vol. 6 (1908), pp. 59-68. Alice Lee: Table of the
Gaussian Tail Functions. Biometrika Vol. 10 (1914), pp. 208-214; Biometric
Tables, p. xxvii.
Hence, from Biometric Tables XI, p. 25, we get
ψ₁ = 0.6928
h′ = 0.777145
ψ₂ = 1.757930
Thus σ = ψ₂ . d = 1.757930 × 2.208333
= 3.882107.


Mean is at a distance h′σ from the origin (in working units).
From Table II: n/N = .2185685
Thus we get a normal curve of
N = 110 individuals
Mean = 1645.3 mm.
S.D. = 77.64 mm.
Curtailing at 1605 we get a fresh table:


Group (mm.)   1605    1585    1565    1545    1525    1505    1485    1465
             -1585   -1565   -1545   -1525   -1505   -1485   -1465   -1445   Total.

Frequency      12      10       4       2       4       1       1       2      36


Calculating “raw” moments about the end of the stump we get

ν₁′ = d = 2.305556      ν₂′ = 9.472222
giving corrected μ₂ = 3.712389

Thus ψ₁ = 0.644531

From Biometric Tables XI, p. 25, we get by interpolation

h′ = 0.4433
ψ₂ = 1.523267

Thus σ = ψ₂ . d = 3.511747 (in working units)
or σ = 70.2349 mm.

Mean is at distance
h′σ = .4433 × 70.2349 mm. from the stump.

Thus Mean = 1635.14 mm.
and n/N = .3227642 (from Table II).
We finally get the following for the shorter end of the frequency
distribution


N = 112
Mean = 1635.14 mm.
σ = 70.23 mm.


This gives a “shorter” group differing in the average stature
but with about the same variability as the total sample.

Let us now turn to the taller end.

Curtailing at 1705, we get ; 


1705 


Group - | -1725 |-1745 |-1765 | -1785-| -1805 |-1825 |-1845 |-1865 | Total. 


Frequency .. 


| 
18°5 10°O 5:00). LOTS 2° fe) re | 1 ie) 47°5 


With origin at 1705, raw moments are


ν₁′ = 2.00
ν₂′ = 6.734211, leading to μ₂ = 2.734211
ψ₁ = 0.6833
Thus h′ = 0.7071
ψ₂ = 1.6818
and we obtain 
DE 2-166 : 
Mean = 1659.02 mm.
S.D. = 67.27 mm.


which is practically identical with the whole sample. 


Thus the “taller” end seems to represent a homogeneous sample
of the whole group, and starting from the taller end, we do not succeed
in breaking up the given frequency distribution into two normal
sub-groups.
The “shorter” end gives a pseudo-component. I shall show
later on, when we consider the question of age-differentiation, that
the shorter tail represents approximately the smaller age groups.


ASYMMETRICAL DISSECTION. 


We have seen that our frequency curve is slightly asymmetric.
As Pearson observes,¹ “the asymmetry may arise from the fact
that the units grouped together in the measured material are not
really homogeneous. It may happen that we have a mixture of
2, 3, ... n homogeneous groups, each of which deviates about its
mean symmetrically and in a manner represented by the normal
curve.”


1 Karl Pearson: “Contributions to the Mathematical Theory of Evolution
I. On the Dissection of Asymmetrical Frequency Curves,” Phil. Trans. Roy.
Soc., Vol. 185A, 1894, p. 72.
Thus an asymmetrical frequency curve may be really built up
of normal curves having parallel but not necessarily coincident
axes and different parameters. The object of the present section
is to discuss the possibility of splitting up our asymmetrical
frequency curve into two component normal curves.¹

Pearson gave necessary mathematical formulae² for this purpose
in his memoir of 1894. The solution depends on finding the
roots of a numerical equation of the ninth degree, and the
arithmetical calculations are extremely laborious. Pearson has
discussed the application of the theory in several actual cases.³

Let μ₂, μ₃, μ₄ and μ₅ be the moment-coefficients, M the mean
and N the total of the given frequency curve. Let m₁, m₂ be the
means, σ₁, σ₂ the standard deviations and n₁, n₂ the totals of the
component curves.


Then if h is the unit of grouping
m₁ = M + γ₁.h and m₂ = M + γ₂.h
Also, taking h = 1, we have
σ₁² = μ₂ − ⅓μ₃/γ₂ − ⅓p₁γ₁ + p₂
σ₂² = μ₂ − ⅓μ₃/γ₁ − ⅓p₁γ₂ + p₂

n₁ = − N γ₂/(γ₁ − γ₂),   n₂ = N γ₁/(γ₁ − γ₂)

Let p₁ = γ₁ + γ₂, p₂ = γ₁.γ₂ and p₃ = p₁.p₂
Also λ₄ = 9μ₂² − 3μ₄, λ₅ = 30μ₂μ₃ − 3μ₅
Then p₃ = (2μ₃³ − 2μ₃λ₄p₂ − λ₅p₂² − 8μ₃p₂³) / (4μ₃² − λ₄p₂ + 2p₂³)

“Hence, so soon as p₂ is known, p₁ = p₃/p₂ can be found, and
then γ₁ and γ₂ will be the roots of:
γ² − p₁γ + p₂ = 0.”
The equation for finding p₂ is one of the ninth degree:
24p₂⁹ − 28λ₄p₂⁷ + 36μ₃²p₂⁶ − (24μ₃λ₅ − 10λ₄²)p₂⁵ − (148μ₃²λ₄ + 2λ₅²)p₂⁴
+ (288μ₃⁴ − 12λ₄λ₅μ₃ − λ₄³)p₂³ + (24μ₃³λ₅ − 7μ₃²λ₄²)p₂² + 32μ₃⁴λ₄p₂ − 24μ₃⁶ = 0


1 Ibid., p. 72. “There are reasons, indeed, why the resolution into two is of
special importance. A family probably breaks up into two species, rather than
three or more, owing to the pressure at a given time of some particular form of
natural selection .... Even where the heterogeneity may be three-fold or more,
the dissection into two is likely to give us, at any rate, an approximation to the
chief groups.”

2 The fundamental formulae have been expressed in a slightly modified form
in terms of the β-constants in a recent paper “On Sexing Osteometric
Measurements.” Biometrika Vol. 10, 1915, pp. 479–487.

3 K. Pearson: “On the Applications of the Theory of Chance to Racial
Differentiations,” Phil. Mag. 1901, p. 110.
K. Pearson: “On the Probability that two Independent Distributions of
Frequencies are really Samples of the Same Population, etc.,” Biometrika Vol.
10, 1915, p. 123 et seq.
In our case we have, for 50 mm. unit of grouping,


μ₂ =   1.821042
μ₃ = − 0.47784377
μ₄ =  11.94162208
μ₅ = − 7.7823

Thus λ₄ = − 5.97912055
     λ₅ = − 2.75830724

After some laborious arithmetical calculations¹ we find the
fundamental nonic:
p₂⁹ + 6.97564064p₂⁷ + 0.34249930p₂⁶ + 13.57765977p₂⁵
+ 7.82662824p₂⁴ + 17.63903620p₂³ − 1.17494113p₂²
− 0.40730898p₂ − 0.01190462 = 0.
I next form the nine Sturm’s auxiliary functions, retaining
four figures in the decimal.
f₁(x) = 9p₂⁸ + 48.8295p₂⁶ + 2.0550p₂⁵ + 67.8883p₂⁴ + 31.3065p₂³
+ 52.9171p₂² − 2.3499p₂ − 0.4073
f₂(x) = − 1.5512p₂⁷ − 0.1142p₂⁶ − 6.0346p₂⁵ − 4.3481p₂⁴ − 11.7594p₂³
+ 0.9138p₂² + 0.3620p₂ + 0.0119
f₃(x) = − 13.8658p₂⁶ + 21.3152p₂⁵ − 4.6837p₂⁴ − 36.2180p₂³ − 54.8628p₂²
+ 2.2860p₂ + 0.4073
f₄(x) = + 1.6693p₂⁵ − 0.5478p₂⁴ − 0.9053p₂³ − 9.2289p₂² + 0.0956p₂
+ 0.0015
f₅(x) = + 6.7018p₂⁴ + 103.845p₂³ − 38.6184p₂² − 1.8367p₂ + 0.2104
f₆(x) = − 417.5259p₂³ + 160.8911p₂² + 7.1896p₂ − 0.8903
f₇(x) = − 2.4849p₂² + 0.0194p₂ + 0.0164
f₈(x) = − 5.0647p₂ − 0.15
f₉(x) = − 0.0141
We can now find the number of real roots from the changes
of sign in the Sturm’s functions.

           x = +∞    x = 0    x = −∞
f(x)         +         −        −
f₁(x)        +         −        +
f₂(x)        −         +        +
f₃(x)        −         +        −
f₄(x)        +         +        −
f₅(x)        +         +        +
f₆(x)        −         −        +
f₇(x)        −         +        −
f₈(x)        −         −        +
f₉(x)        −         −        −


1 My best thanks are due to Prof. J. M. Base M.A., B.Sc. of the Mathematics
Department of the Presidency College, Calcutta for his kind help in checking the
arithmetic in many places.
There are 3 changes of sign with x = +∞, 4 changes with x = 0
and 6 changes with x = −∞. Hence there is 4 − 3 = 1 real positive
root and 6 − 4 = 2 real negative roots.

By trial I locate the positive root between 0 and 1, and the
two negative roots between 0 and −1.

I try the following successive approximations by Horner’s
method.


f(+0.2)   = + .0177        f(+0.15)   = − .0379
f(+0.18)  = − .0117        f(+0.187)  = − .0009
f(+0.188) = + .0002        f(+0.1878) = − .00002


Thus we can take the positive root, p₂ = +0.1878

For the negative roots I try
f(0)    = − .0119          f(−.5)  = − 2.2775
f(−.25) = − .4488          f(−.01) = − .0080
f(−.1)  = + .0001


Root is near −.1. I try higher approximations, now retaining
eight decimal figures.


f(−.1)    = + 0.00008436
f(−.101)  = − 0.00025415
f(−.1001) = + .00005106
f(−.1003) = − .00004478
f(−.1002) = + .00001465
Thus p₂ = −.1002 is another root.
Again
f(−.05)   = + .0034
f(−.01)   = − .0080
f(−.03)   = − .0012
f(−.04)   = + .0014
f(−.034)  = − .00009779
f(−.0343) = − .00001784
f(−.0344) = + .00000869


Thus p₂ = −.0344 is the third root.
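
The location of the three real roots can be reproduced with a short Python sketch. It evaluates the nonic by Horner's scheme, using the coefficients as printed above, and then narrows each bracket by simple bisection (a swapped-in technique that is easier to state than the hand method of successive approximation). The names `COEFFS`, `horner` and `bisect_root` are hypothetical, and the brackets are taken from the trial values listed in the text.

```python
# Coefficients of the nonic in descending powers of p2 (p2^9 ... constant),
# as printed in the text; the p2^8 term is absent.
COEFFS = [1.0, 0.0, 6.97564064, 0.34249930, 13.57765977,
          7.82662824, 17.63903620, -1.17494113, -0.40730898, -0.01190462]


def horner(coeffs, x):
    """Evaluate a polynomial, coefficients in descending powers, at x."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value


def bisect_root(coeffs, lo, hi, tol=1e-10):
    """Halve the bracket [lo, hi] until it is narrower than tol."""
    f_lo = horner(coeffs, lo)
    if f_lo * horner(coeffs, hi) > 0:
        raise ValueError("no sign change in the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = horner(coeffs, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Brackets chosen from the trial values: the sign of f changes inside each.
roots = [bisect_root(COEFFS, 0.15, 0.2),
         bisect_root(COEFFS, -0.101, -0.1),
         bisect_root(COEFFS, -0.04, -0.034)]
print([round(r, 4) for r in roots])
```

Each root agrees with the value adopted in the text when rounded to four decimal places.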

It should be observed that if the material is a real mixture of
two true normal components, then the mathematical solution
would be theoretically unique. In practice, however, a statistical
curve may be the sum of two asymmetric curves, and hence we
must not be surprised if more than one solution is given by the
present method of dissection. Each root of the fundamental
nonic gives one distinct mode of dissection.


Case 1.
p₂ = + 0.1878
Then, p₃ = − 5.282844
p₁ = p₃/p₂ = − 28.130159

Hence γ₁ and γ₂ are roots of
γ² + 28.13017γ + 0.1878 = 0
We get
γ₁ = − 0.00665
γ₂ = − 28.12345


We obtain, finally, for the first component,


σ₁² = 1.940623
σ₁ = 1.3931
n₁ = (28.12345/28.11680) × 200
= 200.0473 = 200, to the nearest integer,
and m₁ = 1655.9175 mm.


The second component is given by
σ₂² = − 285.648943
n₂ = − 0.0473
m₂ = 250.08 mm.


The second component has σ² negative, and is thus imaginary.
Hence dissection into two real components is impossible in this case.
The first component, which is the only real component, gives
practically the whole of the given sample. The total frequency
of the second component is only −.0473 and is quite negligible.
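
The step from a root of the nonic to the component totals can be sketched in Python as follows. The sketch solves the quadratic γ² − p₁γ + p₂ = 0 and then assumes the relations n₁ = −Nγ₂/(γ₁ − γ₂) and n₂ = Nγ₁/(γ₁ − γ₂), which reproduce the totals quoted for Case 1; the function name `split_totals` is hypothetical.

```python
import math


def split_totals(p1, p2, n_total):
    """Roots of g^2 - p1*g + p2 = 0 and the corresponding component totals."""
    disc = p1 * p1 - 4.0 * p2
    if disc < 0:
        raise ValueError("gamma_1 and gamma_2 are not real")
    root = math.sqrt(disc)
    g1 = (p1 + root) / 2.0      # the algebraically larger root
    g2 = (p1 - root) / 2.0
    n1 = -n_total * g2 / (g1 - g2)
    n2 = n_total * g1 / (g1 - g2)
    return g1, g2, n1, n2


# Case 1: p1 = p3/p2 = -28.130159, p2 = +0.1878, N = 200.
g1, g2, n1, n2 = split_totals(-28.130159, 0.1878, 200)
print(round(g1, 5), round(g2, 5), round(n1, 4), round(n2, 4))
```

The two totals always add up to N, so a negative n₂ of this size leaves the first component with slightly more than the whole sample, as the text notes.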


Case 2.
p₂ = − 0.1002
We find p₃ = + 1.21095836
and p₁ = − 12.085413
Thus γ² + 12.085413γ − .1002 = 0
and γ₁ = + .008285
γ₂ = − 12.101990


We get for the first component,


n₁ = 199.8631
σ₁² = 1.741046
σ₁ = 1.319487
m₁ = 1656.6642 mm.

The second component is

n₂ = + 0.1369
σ₂² = − 29.03962371
m₂ = 1051.98 mm.


We again find that the first curve gives practically the whole
of the given sample, while the second is imaginary.
Case 3.
p₂ = − 0.0344
Whence p₃ = + 0.171970
p₁ = − 4.973140
Thus γ² + 4.973140γ − .0344 = 0
γ₁ = + 0.0069075
γ₂ = − 4.9800475
First component
Mean = 1656.5954 mm.
n₁ = 199.72
σ₁² = 1.766109
σ₁ = 1.328950
Second component
Mean = 1407.2476 mm.
n₂ = 0.277
σ₂² = 16.59
The second component is real this time, but its frequency
being only .277, it is again negligible. The first component gives
practically the whole of the distribution.

It will be seen that the first solution (p₂ = .1878) gives the
frequency curve as the difference of two normal curves. “The
probability curve, with positive area, may possibly be looked upon as
the birth population (unselectively diminished by death). The
negative probability curve is a selective diminution of units
about a certain mean; that mean may, perhaps, be the average
of the less fit.”¹ In our present case, however, the negative
component is imaginary. Hence we conclude that the real
component is describing the general population with sufficient
accuracy.

In the case of the second solution (p₂ = −.1002) the second
component, though now additive, is still imaginary. The mean
is at 1051.98 mm. This component may be interpreted as
representing a “tendency” towards the presence of a small
proportion of dwarfs.

This tendency becomes more prominent in the third solution
(p₂ = −.0344). We find that the second component, which is
additive and real, definitely represents a “dwarf” distribution with
an average stature of 1407.24 mm. The proportion, however, is
extremely small. It is only 0.14% and can be safely neglected in
samples of 200. In larger samples of over a thousand, we should
not be surprised to get a few dwarfs.

So far as the present analysis goes we must conclude therefore
that it is not possible to break up our given curve into two real
significant component distributions. The only sign of
differentiation perceived so far is a tendency towards the presence of a
very small proportion of dwarfs.


1 Pearson, Phil. Trans. Roy. Soc., Vol. 185 A, 1894, p. 76.


SYMMETRICAL DISSECTION. 


We have already seen that β₁ (which measures the deviation
from symmetry) is not significantly different from zero in our
present case. In other words, within the limits of probable
errors it is quite possible to look upon our curve as a symmetrical
one. “Another important case of the dissection of a frequency
curve can arise, when the frequency curve, without being
asymmetrical, still consists of the sum or difference of two
components, i.e. when the means about which the component groups
are distributed are identical. This case is all the more
interesting and important, as it is not unlikely to occur in statistical
investigations, and the symmetry of the frequency-curve is then
in itself likely to lead the statistician to believe that he is dealing
with an example of the normal frequency-curve.”¹

Pearson also notes that “symmetry may arise in the case of
compound frequency curves, even without identity of the means
of the components. In this case, for two components, we should
have for different means, equality of component group totals and
their standard deviations. This equality seems less likely than
equality of means and divergence of totals and standard
deviations.”²

Pearson then shows that for this second type of symmetrical
dissection (i.e. divergent means) a necessary condition is that 3μ₂²
should be greater than μ₄, that is β₂ should be less than 3, or the
curve should be platy-kurtic. But we have seen that our curve
is lepto-kurtic (i.e. 3μ₂² is less than μ₄), hence this type of
dissection is impossible in the present case.

I shall now discuss the possibility of the first type of
symmetric dissection. The fundamental equations are given in the
Memoir cited, p. 90. I shall slightly modify these equations in
order to express them in terms of the β-variables.

Let N, n₁, n₂ represent the totals and Σ, σ₁ and σ₂ the
standard deviations of the compound and the two component
curves respectively. Then, as Pearson has shown, the solution is
given by


n₁ = (μ₂ − w₂)/(w₁ − w₂) . N,     n₂ = (w₁ − μ₂)/(w₁ − w₂) . N

σ₁² = w₁,  σ₂² = w₂,  where μ₂ = Σ²


and w₁ and w₂ are the roots of
(μ₄ − 3μ₂²)w² + (μ₂μ₄ − ⅕μ₆)w − (⅓μ₄² − ⅕μ₂μ₆) = 0


1 Karl Pearson: “On the Dissection of Symmetrical Frequency Curves,”
Phil. Trans. Roy. Soc., Vol. 185A, 1894, p. 90.
2 Ibid., footnote on pp. 90-91.
This equation involves μ₆. We can however transform this
equation to the β-variables.
Dividing throughout by μ₂⁴, we get

(μ₄/μ₂² − 3)(w/μ₂)² + (μ₄/μ₂² − ⅕ μ₆/μ₂³)(w/μ₂) − (⅓ μ₄²/μ₂⁴ − ⅕ μ₆/μ₂³) = 0

But β₂ = μ₄/μ₂² and β₄ = μ₆/μ₂³.
Changing to the β-variables and putting x = w/μ₂, we get

(β₂ − 3)x² + ⅕(5β₂ − β₄)x − (5β₂² − 3β₄)/15 = 0

Thus
w/μ₂ = x = [ ⅕(β₄ − 5β₂) ± √{ (1/25)(β₄ − 5β₂)² + (4/15)(β₂ − 3)(5β₂² − 3β₄) } ] / 2(β₂ − 3)


The condition for a real solution is that


⅕(β₄ − 5β₂) > √{ (1/25)(β₄ − 5β₂)² + (4/15)(β₂ − 3)(5β₂² − 3β₄) }


Squaring and subtracting


0 > (4/15)(β₂ − 3)(5β₂² − 3β₄)


Pearson has shown that it is necessary that w₁ and w₂ should
be of the same sign.
The necessary condition for real solution becomes:
For lepto-kurtic curves, β₂ − 3 > 0 or β₂ > 3, it is necessary
that 3β₄ should be greater than 5β₂².

For platy-kurtic curves, β₂ − 3 < 0 i.e. β₂ < 3, the condition
is that 5β₂² must be greater than 3β₄.

With ungrouped distribution it is almost impossible to find β₄
directly. We can however find β₄ in terms of β₁ and β₂, from
Table XLII (b), p. 78 of Tables for Biometricians and Statisticians.¹


We have, β₁ = .068756
         β₂ = 3.5046
For β₂ = 3.5,    β₄ = 23.7289 + .68756 × [2.0142] = 25.1137
For β₂ = 4.0,    β₄ = 31.00 + .68756 × [10.766] = 38.4023
For β₂ = 3.5046, β₄ = 25.1137 + (.0046/.5) × [13.2886]
                    = 25.2360


We have β₂ greater than 3, and 3β₄ greater than 5β₂², hence
we shall obtain a real solution.
The quadratic is


.5046x² − 1.5426x + 0.953127 = 0


1 Cf. K. Pearson: “Skew Correlation and Non-Linear Regression,” p. 8
(Draper’s Company Research Memoirs).
The solution is given by


w₁ = 2.1975
w₂ = .6689
Since μ₂ = 1.8162
We get σ₁ = 74.12 mm.,     σ₂ = 40.89 mm.
And n₁ = (μ₂ − w₂)/(w₁ − w₂) × 200 = 150.11,     n₂ = (w₁ − μ₂)/(w₁ − w₂) × 200 = 49.89

It is thus possible to break up the curve into two normal
curves with the same Means but widely different Standard
Deviations. It will be observed that nearly three-fourths of the sample
has got a greater variability, while about one-fourth seems to be
a very stringently selected group. This particular solution may
be only a peculiarity of the sample and may have no reference to
actual fact so far as the general population is concerned. A
calculation of the probable error of β₄ may throw some light on
the question.

Pearson¹ gives the percentage variation of β₄ to be 23.3 in a
sample of 500. Multiplying this by

√(500/200) = √2.5,

we get the percentage variation in a sample of 200 to be 36.84.
Hence the probable error in the present case is so large as ±9.28.

We thus have β₄ = 25.236 ± 9.28

If we take our actual value of β₂ = 3.5, the necessary condition
for a real solution is that β₄ must be greater than 20.42. If
the value of β₄ for the general population is less than 20.42
(with a value of β₂ = 3.5) then the present method of dissection
will fail.

This limiting value is only 4.82 less than the value of β₄ in
the sample, while the probable error is ±9.28. It is therefore
not at all unlikely that β₄ should be less than 20.42 in the general
population. We conclude therefore that it is not unlikely that
the possibility of this particular type of dissection is only a
peculiar property of the sample and has no reference to actual fact in
the case of the general population.

Hence we are not justified, on this evidence alone, in concluding
that the sampled population is heterogeneous in character.
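
The arithmetic of this argument can be retraced with a small Python sketch. It assumes, as the text does, that the percentage probable error varies inversely as the square root of the sample size, and takes the limiting β₄ from the condition 3β₄ > 5β₂²; the function name `scaled_percentage_error` is hypothetical.

```python
import math


def scaled_percentage_error(pct, n_ref, n_new):
    """Rescale a percentage probable error from a sample of n_ref to n_new,
    assuming the error varies as 1/sqrt(n)."""
    return pct * math.sqrt(n_ref / n_new)


beta2, beta4 = 3.5, 25.236

pct_200 = scaled_percentage_error(23.3, 500, 200)   # percentage variation for n = 200
pe_beta4 = beta4 * pct_200 / 100.0                  # probable error of beta4
limit_beta4 = 5.0 * beta2 ** 2 / 3.0                # from 3*beta4 > 5*beta2^2
margin = beta4 - limit_beta4                        # excess of the sample value over the limit

print(round(pct_200, 2), round(pe_beta4, 2), round(limit_beta4, 2), round(margin, 2))
```

The margin comes out at roughly half the probable error, which is the basis for treating the dissection as possibly an accident of sampling.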
1 K. Pearson: “Skew Correlation and Non-Linear Regression,” p. 8
(Draper’s Company Research Memoirs).


Note added on the 27th November, 1920.


In view of the great importance of the question of
heterogeneity I thought it desirable to consider this question in greater
detail. I calculated the grouped moment-coefficients directly
up to μ₆ with 50 mm. as the unit of grouping. I find


μ₂ =   1.821042
μ₃ = − 0.47784377
μ₄ =  11.94162208
μ₅ = − 7.7823
μ₆ = 129.74384248
Thus

β₂ = 3.601

and β₄ = 21.4740


Since β₂ is greater than 3, it is necessary that 3β₄ should be
greater than 5β₂². Actually we find
3β₄ = 64.4220
while 5β₂² = 64.8360, so that 5β₂² is > 3β₄.

Thus no real solution is possible in this case. But we must
note that there is some tendency towards a solution of this type.
I do not propose to draw any inference from this result. I have
not yet analysed the other frequency curves and so I am not in
a position to either confirm or refute this tendency towards a very
special type of splitting up.¹
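
The two verdicts reached so far (a real solution with the tabular β₄, none with the directly calculated β₄) can be compared in a short Python sketch of the necessary condition. The function name `symmetric_dissection_possible` is hypothetical, and the handling of the boundary case β₂ = 3 is an assumption of the sketch, not something discussed in the text.

```python
def symmetric_dissection_possible(beta2, beta4):
    """Necessary condition for a real symmetrical dissection with equal means.

    Lepto-kurtic curves (beta2 > 3) need 3*beta4 > 5*beta2^2;
    platy-kurtic curves (beta2 < 3) need 5*beta2^2 > 3*beta4.
    """
    if beta2 > 3:
        return 3 * beta4 > 5 * beta2 ** 2
    if beta2 < 3:
        return 5 * beta2 ** 2 > 3 * beta4
    # beta2 == 3: the quadratic loses its leading term; treated here as no solution.
    return False


# beta4 interpolated from the tables (ungrouped beta1, beta2).
from_tables = symmetric_dissection_possible(3.5046, 25.2360)

# beta4 calculated directly from the grouped moments.
from_moments = symmetric_dissection_possible(3.601, 21.4740)

print(from_tables, from_moments)
```

With the directly calculated constants the two sides of the inequality differ by less than one per cent, which is why the text speaks only of a "tendency".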


Goodness of fit with Sum of Dissected Components.


First component:
Mean stature = 1656.79 mm.
S.D. = σ₁ = 74.12 mm.
Total = n₁ = 150


Second component:
Mean stature = 1656.79 mm.
S.D. = σ₂ = 40.8932 mm.
Total = n₂ = 50

    I           II        I + II (Total)   Observed   Difference   (m − m′)²/m′
  First       Second       Theoretical
Component.  Component.         m′             m         m − m′

  6.5348       .0471         6.5818           8         1.4182        .3055
 15.9753      1.4622        17.4375          14         3.4375        .6777
 31.3152     11.2966        42.6118          45         2.3882        .1307
 39.4329     22.9316        62.3645          60         2.3645        .0896
 32.2382     12.4261        44.6643          48         3.3357        .2491
 17.2674      1.7735        19.0389          20          .9611        .0485
  7.2361       .0647         7.3008           5         2.3008        .7246

                                                              χ² = 2.2257


1 Since going to press, I have obtained expressions for the Probable Errors
of the Component Frequency Constants, which confirm the non-significant
character of the dissection in the present case. I hope to publish these new
formulae for Probable Errors at an early date.
Thus, P = .894680
With the single normal curve, we had P = .826583


Difference = .068097


Thus there is an improvement of 8.2% in the fit. This is
satisfactory. But, in view of the discussion of probable errors,
perhaps this is not sufficient to warrant us in asserting that the
possibility of the present type of dissection is unmistakable
evidence of heterogeneity of the material.
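
The χ² of the table can be recomputed from its theoretical and observed columns with the following Python sketch. The values are those read from the table; the function name `chi_square` is hypothetical. Recomputing every term directly gives a total that agrees with the printed 2.2257 to two decimal places.

```python
def chi_square(observed, theoretical):
    """Sum of (m - m')^2 / m' over the groups."""
    return sum((m - mp) ** 2 / mp for m, mp in zip(observed, theoretical))


# Theoretical totals m' (first + second component) and observed frequencies m
# for the seven 50 mm. groups.
theoretical = [6.5818, 17.4375, 42.6118, 62.3645, 44.6643, 19.0389, 7.3008]
observed = [8, 14, 45, 60, 48, 20, 5]

chi2 = chi_square(observed, theoretical)
print(round(chi2, 4))
```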


SECTION VI. DATA FOR COMPARISON. 
SOURCE OF THE MATERIAL. 


I have collected material from many different sources. In
1897, K. Pearson¹ gave the coeff. of variation for 1000 English
middle-class men, 390 Bavarian men, 284 French (from statistics
given in “Memoires de la Societe d’Anthropologie de Paris,” 1888)
and also some data for American school children (from the years
6 to 10, taken from Porter’s “Growth of Saint Louis Children”).
I have retained his French and German data but have substituted
corrected values for Englishmen given by Pearson in a later
paper. I have omitted the children as being all under the age
of 10.

Pearson also reduced statistics for U.S.A. recruits² and gave
final figures for his family data³ in Biometrika in 1903. His
family data consists of 1078 records of middle class English fathers
and sons.

Powys⁴ gave the heights of 2862 male criminals from New
South Wales, distributed into different age-groups. I have
selected the total variability⁵ of the whole group, for in our
Anglo-Indian data men of all ages are present. Powys considers his
data to be “extremely homogeneous.”⁶

In 1901, W. R. Macdonell⁷ discussed the measurements for
3000 English criminals. He also calculated the coeff. of variation
for 1000 Cambridge undergraduates.⁸

Raymond Pearl⁹ has calculated variabilities of stature for
416 Swedes, 475 Hessians, 266 Bohemians, and 365 Bavarians.¹⁰
The measurements were all taken on dead bodies and the coeff. of
variation are 4.009 ± .094, 3.954 ± .117, 4.323 ± .127 and 3.838 ± .096
respectively.

Blakeman¹¹ has analysed a short series of 117 English males
who died in hospitals. The coeff. of variation¹² for stature is

1 K. Pearson: “Chances of Death,” Vol. 1, pp. 294—200.
2 Phil. Trans. Roy. Soc., Vol. 184A, p. 386.

3 Biometrika Vol. 2 (1903), p. 370; K. Pearson and Alice Lee: “On the
Laws of Inheritance in Man,” pp. 357–482.

4 A. O. Powys: “Anthropometric Data from Australia,” Biometrika Vol. 1
(1901), pp. 30-40.

5 Ibid., p. 44.   6 Ibid., p. 38.

7 W. R. Macdonell: “On Criminal Anthropometry and the Identification
of Criminals,” Biometrika Vol. 1 (1901), pp. 177–277.

8 Ibid., p. 189.

9 Raymond Pearl: “Variation and Correlation for Brain Weight,”
Biometrika Vol. 4 (1905), pp. 13-104.

10 Ibid., p. 23.

11 J. Blakeman: “A Study of the Biometric Constants of English
Brain-Weights, and their Relationships to External Physical Measurements,” Biometrika
Vol. 4 (1905), pp. 124-160.

12 Ibid., p. 126.
4.55 ± .20. Blakeman believes¹ the “increased variability in
stature to be due to the measurements being taken on the corpse
and not on the living subject.” He mentions further² that the
average V for males in Pearl’s data is 4.11.

I have thought it best to omit the above series of corpse
data for purposes of comparison. It will be observed that the
variability is in each case considerably higher than the average
variability (which is about 3.6) obtained by omitting them. Thus
the only effect of including the “corpse” data would be to still
further increase our average variability. We may further note that
in most of the above cases, the variability is even higher than the
variability of our Anglo-Indian data, which is about 4.06. Thus
omission of the corpse data cannot affect our general conclusion
that the variability of the Anglo-Indian series is not significantly
greater than the average variability of stature for homogeneous
material.

Tocher³ gave in 1906 a very large series of measurements
on the Scottish Insane, numbering 4381 males.

Schuster⁴ in 1910 gave V for different age-groups of Oxford
undergraduates. For reasons already explained I have taken
the average variability for the whole group of 959 individuals.
In an editorial note to the above,⁵ some results for 493 Scottish
(Aberdeen) undergraduates are quoted. I have calculated the
coeff. of variability in this latter case also. I may note in passing
that the different age-groups of the Oxford data do not give lower
values of variability, in fact give slightly greater values than the
total in many cases.⁶

Craig⁷ gave the results of a very large series of measurements
of modern Egyptians. These were classified in accordance with
the town or district of birth. The total number in each group
is fairly large and this series gives us a very good list of
variabilities for purposes of comparison. I have retained the separate
variability for Aswan, omitting the total variability as the material
is not homogeneous.

Garett⁹ has given a series of measurements of the natives of
Borneo and Java. The majority were coolies in the employ of the
author. Unfortunately the number in the case of each people is
not very extensive, and I have been only able to retain the values


l Jbid., p. 131. 2 Word py. 132, 

3 J. F. Tocher: ‘‘ The Anthropometric Characteristics of the Inmates of 
Asylums in Scotland.,” Biometrika Vol. 5 (1906), pp. 298-—350. 

4 E. Schuster: ‘‘ First Results from the Oxford Anthropometric Laboratory, ”’ 
Biometrika Vol. 8 (1911), pp. 40-51. 

6. Jord, -p. 49. 

§ Thus the lumping together of all age-groups cannot again affect the general 


’ validity of our conclusions. 


7 J. I. Craig “Anthropometry of Modern Egyptians,’ Biometrika Vol. 8 
(1911), pp. 69—77. 

8 fbid.. pt 75. 

9 T.R. H. Garett: ‘‘ Natives of the Eastern Portion of Borneo and Java,” 
Four. Roy. Anthrop. Inst., Vol. XLII, 1912, pp. 60—66. 
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for Javanese (17), Banjerese (33) and Sundanese (37), as no other 
series includes more than 7 individuals. . 

Joyce¹ has given figures for 25 different groups of people of
Chinese Turkestan and the Pamirs. But again the total number
is rather small in most cases, even the longest series including only
67 individuals.

Leys and Joyce² gave measurements for 38 different groups
of people from East Africa. Some of these are foreigners. Numbers
are moderately large in some cases, the longest series containing
384 individuals.

Seligmann³ has given measurements for 7 groups of people
of the Anglo-Egyptian Sudan. The number in each group is moderately
large, being on an average about 50. Dr. Bowley has
analysed the Dinka group containing 116 individuals. The absolute
S.D. (9.66 mm.) as well as the coeff. of variation (5.4311) is
exceptionally high. Dr. Bowley⁴ concludes from the goodness of fit
that “there is no indication of the mixture of two distinct groups
with widely differing averages.”⁵

Frankly speaking, such a high value of V as 5.4311 ± .24 for
homogeneous material is extremely puzzling. We have of course
obtained several high values of V, but in all such cases the numbers
are quite small and the P.E. quite large. One would like to
obtain independent evidence regarding the homogeneity of the
Dinka people. In any case, a fresh series of measurements of the
Dinka people is urgently needed.
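
The Dinka coefficient discussed above can be checked from the figures given for that group in Table 5 (N = 116, Mean = 1786 mm., S.D. = 97.0 mm.). The following is only a sketch: it assumes the simple approximation P.E. of V = 0.67449 V / sqrt(2N), which the paper does not state explicitly, and the function names are illustrative.

```python
import math

PE_FACTOR = 0.67449  # probable error = 0.67449 x standard error


def coefficient_of_variation(mean_mm, sd_mm):
    """Return V = 100 * S.D. / Mean."""
    return 100.0 * sd_mm / mean_mm


def probable_error_of_v(v, n):
    """Approximate P.E. of V as 0.67449 * V / sqrt(2N) (assumed simple form)."""
    return PE_FACTOR * v / math.sqrt(2.0 * n)


# Dinka row of Table 5: N = 116, Mean = 1786 mm., S.D. = 97.0 mm.
v_dinka = coefficient_of_variation(1786.0, 97.0)
pe_dinka = probable_error_of_v(v_dinka, 116)
print(f"V = {v_dinka:.4f} +/- {pe_dinka:.4f}")
```

Under this assumption the result agrees closely with the tabulated 5.4311 ± .2404, so the high value is not an arithmetical slip in forming V from the Mean and S.D.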

Goring⁶ has given extensive data for English criminals, to
which we shall have to refer again.

Whiting⁷ has discussed the case of 500 English convicts belonging
to Dr. Goring’s data.

Orensteen⁸ gave results for 802 adult male Egyptians born in
Cairo.

Addendum.

Dudley Buxton has recently published the Variabilities of 10
Mediterranean and 3 Jewish races.⁹


1 T. A. Joyce: “Notes on the Physical Anthropology of Chinese Turkestan
and the Pamir,” Jour. Roy. Anthrop. Inst., Vol. XLII, 1912, p. 450.

2 Norman M. Leys and T. A. Joyce: “Note on a series of Physical Measurements
from East Africa,” Jour. Roy. Anthrop. Inst., Vol. XLIII, 1913,
p. 195.

3 C. G. Seligmann: “Some Aspects of the Hamitic Problem in the Anglo-
Egyptian Sudan,” Jour. Roy. Anthrop. Inst., Vol. XLIII, 1913, pp. 592-705.

4 Ibid., p. 705.

5 In the absence of any attempt at statistical dissection, mere homotyposis in
graduation cannot be considered conclusive evidence of homogeneity.

6 Charles Goring: “The English Convict,” 1913.

7 Madeline H. Whiting: “On the Association of Temperature, Pulse and
Respiration with Physique and Intelligence in Criminals,” Biometrika Vol. 11
(1915), pp. 1-37.

8 Myers M. Orensteen: “Measurements of Cairo-born Egyptians,” Biometrika
Vol. 11 (1915), pp. 67-81.

9 Biometrika, Vol. XIII, 1920, pp. 92-112.

N.B. I may note that in many cases the Coeff. of Variation has been
calculated by me.

Risley¹ published the crude measurements of 87 Indian castes
and tribes, but he did not calculate a single frequency constant
or a single probable error. The size of sample varies from 185 to
2, yet every average has been given equal weight on the strength
of his authority. The averages published in his book were in many
cases hopelessly wrong; in one instance the difference amounted to
no less than 60 mm.

I have just finished calculating the frequency constants for the
whole of Risley’s data for Stature. I hope to publish my results
at an early date. Meanwhile I shall use my summary table for 
purposes of comparison in this paper. 

It should be noted that the present section was already sub- 
mitted to the press when the Mediterranean data reached me. 
The Risley data also had not then been reduced. Thus the
earlier part of the present section does not include the above two 
series of data. I have retained a portion of the older work, but 
have gone over the whole ground again with the inclusion of the 
new data. 

The Caste data of Risley are substantially differentiated from
other samples in showing a significantly lower Variability; hence
the Anglo-Indian sample is found to be significantly more
variable than the Indian Castes and Tribes. Otherwise the
inclusion of the new data does not upset the earlier conclusion 
that the Anglo-Indian Variability, though higher than the general 
Variability of ‘‘ homogeneous ’’ races, is not significantly different. 
As a matter of fact Anglo-Indian Variability is just about the 
same as the Variability of European (in a geographical sense only) 
races. 


NOTE ON THE RETENTION OF CRIMINAL DATA.


It may be objected that, a criminal population being substantially
differentiated from the general population, it is not legitimate
to use criminal data for comparative purposes. We can only reply
that if there is any fundamental anthropological differentiation,
this has not yet been proved to be the case. On the other hand,
the bulk of available statistical evidence goes to show that there is
no such thing as a different criminal type. J. I. Craig² says of his
Egyptian data, “it may be objected that criminality in itself is a
determining factor of selection, but the objection does not hold in
Egypt,” and he proceeds to explain why. In the case of New
South Wales also the same is true. There is no significant differentiation
of criminals from the general population.³

As regards the English convict, we need only refer to the
great work on the subject by Dr. Charles Goring (already cited
several times in this paper). Goring comes to the conclusion that
the Lombrosian doctrine of criminal types is false. “Criminals as


1 “Indian Castes and Tribes,” 2 Vols. (1904?) (Superintendent of Government
Printing, Calcutta).

2 J. I. Craig: loc. cit., Biom. Vol. 8 (1911).

3 Goring: “The English Convict” (1913), p. 198.

criminals are not a physically differentiated class of the general
community. The physical and mental constitution of both criminals
and law abiding persons of same age, stature, class and intelligence
are identical. There is no such thing as an anthropological
criminal type.”¹ In view of Goring’s work we may safely include
criminal data for purposes of comparison, at least until statistical
evidence in support of the Lombrosian doctrine is forthcoming.


TABLE 5. 


Mean Stature, S.D. and Coeff. of Variation of 100 different races.
Note. (1) The number immediately after the name of the race gives the
reference of the source from which material is collected (see end of table).
(2) Second column gives number of individuals on which the average is based.
(3) Races italicised were selected as more reliable. It will be noticed that the
total number in each case is greater than 25, and the P.E. of Coeff. of Variation
is less than °22 or J "Te 


Name of Race. | No. in Sample. | Mean (mm.) ± P.E. of Mean. | S.D. in mm. ± P.E. of S.D. | 100 × (Coeff. of Var. ± P.E. of V.)
I Segua (1) 12 1670° + 5°7| 29°46+ 4 a 176°42 + 24°28 
2 Digo (1) 15 1629°4 + 5°9| 33°'78+ aa | 207°32425°52 
3 Nyika (1) OR is 16581 + 6°3| 39°37+ 44 237°43+ 26°68. 
4 Comoro (1) 23 1662°9 + 59} 4166+ 4’! | 250°49 + 24°91 
5 Kaseri (1) eae 1696°5 + 8:6 | 43°04+ 6°0| 259°02+ 35°66 
6 Javanese (2) tt 17 1570°59+ 6°71) 43°3 + 6'4| 261° + 34° 
7 Kelpin (3) ae | 15 1650°00+ 9°8| 44°6 + 7°0| 270°30+4 33°28 
8 Sartkoli (3) 40 1637°7 + 6°0| 44°3 + 4°3| 270°50+20°39 
9 Nandi (1) aes 16764 + 83) 45°9 + 5°9| 274°24+ 34°95 
10 Lamu (1) | 26 1637°0 + 5°9| 44°06+ 4°2 | 274°63 + 25°68 
11 Dolan (3) 16 1641°I + 9°5 | 4610+ 6°7  280°89 + 33°49 
12 Muscat Arab (1) | 3 1648°4 + 5°8| 47°8 + 41 289 67 + 24°81 
13 Faizabad (1) eos 1669°2 +1I'O| 49°2 + 7°8  294°75+40°58 
14 Shilluk (5) ae | 17760 + 9°6 53°0 + S84 298 42 + 38°04 
15 Baganda (1) se 44 1604°7 + SI, 50°3 + 3°6| 302°104 21°72 
16 Hami {3) sat 21 | 1630° + 8°3) 49°5 + 5°9| 303°68 + 31°60- 
17 Yemen Arab (1) .. | 20 | 1647°7 + 7°6 | 50°290+ 5°4 305°22+ 32°55 
18 Swahili (1) a 53 | 1646°7 + 4°7 | 50°3 + 3°3 | 305°41420°O1 © 
19 Wanyamwezt (1) + 101 | 1764°9 + 3°S| 51°6 + 2°4| 3073541461 
20 Nissa (3) | 9 | 1022 +12°7 | 49°5 + 9°0) 30895740 ae 
21 Pakhpo (3) | 25 1604°0 + 7°O| 49°5 + 5°4 | 308°60+ 20°43 
22 Segeju (1) +s | 36 | 1631°l + 5°7| 50°95 + 4°0] 309°82424°62 
23 Chinese (3) | 20 | 1667°0 + 8°5| 51°77 + GO| 310°97+ 33°08 
24 Banjerese (2) ti 33  1569°64+ 5°71 48°61+ 4°04) 310° +426" 
25 Niya (3) a 18 1626°0 + 9°O| so°4 + 6:4 | 310°I5 + 34°86 
26 Karnaghu- -Tagh (3) 21 1660°§ + 8°3)| 52°99 + 5°9 | 318°574+33°15 
27 Canal AY rh (4) Lo) yee 1658°7 + 3°2| 54'2 + 2°3 | 326°00414°00 
28 Kababish (5) 23 170970 + 7°9| 56°0 + 5°6| 3277+ 32°58 
29 Cutch (1) ahs 24 1633°O + 7°4) 54°l + 5°3| 331°3143225 
30 Nejmps (1) ate 11 1723°l +11°7 | 57°4 + 83] 333°1344790 
31 Khotan (3) = 67 1655°2 + 4°6 55°5 + 3°2 | 335°304+19°53 
32 Punjabi (1) 60 1683°8 + 5°O 57°2 + 3°5 | 339°41420°89 — 
33 Bantu Kavirondo (1) 24 16926 + 7°9| 57°4 + 5°6 | 330°13433°01 : 
34 Minia (4) és 49! 1669°70+ 1°7|) 56°O + 1°2 | 339°00+ 7°00 
35 Sundanese (2) 084 37 1591°30+ 6°00 54°07+ 4°24) 340° +27° 
30 Kamba (1) ow} @ 298 16566 + 3°4| 56°O + 24 | 341°92414'41 
37 Turfan (3) 72 | 1662°6 + 4°5| 57°0 + 3°20) 342°383+19°27 ~ 
38 Beheira (4) | 525 | 1676.8 ± 1.7 | 57.4 ± 1.2 | 342.00 ± 7.00


1 Goring: Ibid., p. 370.

39 Biloch (t) f 15 | 1649°7 + 9°9| 56° + 7°0| 343°34+42°27 
40 Duruma (1) 67 1649°2 + 4°8! 57°7 + 3:4] 349°60+ 20°37 
Al Arab and Swahilil (32 32 1644°6 + 69] 57°7 + 4°90] 350°574+20°55 
42 Giza (4) aE 326 1678°O + 2°2 58°38 + 1r'6| 350° + 9° 
43 Chitrali (3) . 22 1684°5 + 8°1| 59°3 + 5:8.) 352°03 + 35°79 
44 Qena (4) -: 824 16780 + 14) 59°04 IO'yg52°. + 6 
45 Bent Amer (5) . 51 TO4G7 E57 | §St Ae | 35300 £23757 
40 Girga (4) = 610 10777 ft WO) Sora + 151 | 353 ee 7 

7 Fayum (4). oi 413 TO7AGL E20 |a59°2 4 bea 354! oe 8: 
48 Polu (3) Sytolige eau OF 2 7:0.) 58:3 + AO | 354557 4030" 37 
49 Bent Suef (4) oe eA ood. | 1662°3 + 2°0] 59°1 + 1°4] 355° + 9° 
50 Gharbia (4) Bey, EPOG | OLOG 308i 2/504. + O'9'| 355100475: 

31 Masai (1) QI hal 7vOOrOn 4s 3\260r7 4 63°O | 357°0845 17°85 
52 Hadentoaand Amara (s 54 1070" fens {Gor £4. i] 35790 23°23 
53 Aksu (3) 13 163777 +106 | 58°5 + 7°5 | 357°20447°25 
34 Sheher (1) es 82 1615°7 + 4°41 57°99 + 3°1 | 358°43418°387 
55 Alexandria (4) ai 643 160672, = 1-6) 50°7- F" I°r | 359°0" + 57° 
56 Kékyar (3) 24 37° 1629°'2 + 6°3| 58:0 +. 4:4 | 361-524 28°34 
57 Giriama (1) Wey 24. 1629°7 + 81] 589 + 5°7| 361°590+ 35°20 
58 Dagahlia (4) ain an SOF 1660°6 + 1°8| 60:0 + 1°3/} 361° +408 
59 Asstut (4) a2 BS 1668°9 + I°4| 160°3+ 1°0| 362 +406 
60 Cairo (14) ae 802 1682:0) 5 1-4. 60" 3" 4 1-0) 304" 06" 
61 Wakhi (3) I 1680° + 8:8] 61°8 + 62] 367°84-+40°25 
62 Camb. Students (11). 1000 1748°88+ 1:4 | 646 + 097] 369°58+05°58 
63 Ajawa (1) ae 16 EOS 22) 1073) WGD-2. +... 773) 3707484 64°17 
64 Aswan North (4) .. II5 1683°3 + 3°9| 62°33 + 2:8 | 370°:00+ 16:00 
65 Menufia (4) = 718 BORG Oi EOP O2S hae ob eS 7LS tS ate 78 
66 Embu (1) a 110 1OZO-LA + 2704012, 22.0 | 375.50 + 17°07 
67 Kafir (3) AS ae 1667°8 + 9°0 | 63°3 + 6°4| 379°54+ 42°66 
68 Manvema (1) SEA Foomae 1667°5 + 66 | 63°2 + 4°7 | 379°28+4+27°91 
69 Kikuyu (1) Saint Aes FS 1640" - 2°2 | 62:5 + 1°5| 380°98+ 9°27 
70 Qualiubia (4) ae 295 EtGo2-4 et (25 Og7r fois | 280° + 107% 
71 Shargia (4) 516 TOD GQ: | 63°35) En k-3 || 382>°. 48 
7a Us. A - Recruits (6) . . |25,808 1709°4 + 0°27] 65°6 + org] 383°764 I-15 
73 Nuer (5) 39 E8000) es SFO 70" 445 387°59 + 29°60 
74 N.S.W. Criminals () 2871 1698°8 + 0°83) 65°3 + 058] 387°33 +0345 
75 Nyasa (1) 21 1640°O + 9°4| 637 + 7°3 | 390°27 + 40°61 
76 Keriya (3) at 21 1612°5 + 9°3| 62°99 + 65 | 390°07+ 49°59 
77 Sukuma (1) « 21 17170 + 9°99] 67°73 + 70| 392°01+ 42°80 
78 Kirghiz (3) rs) 38 16408 + 62] 64°6 + 4°4] 393°71 + 30°46 
79 Somali (1) Sot haar, 17351 + 76 | 686 + 5°4] 395°25 + 36°27 
80 Suk (1) pact 15 1677°9 +11°6 | 66°3 + 8:2 | 395°2 +4866 
81 Eng. Sons (9) 1078 17440 + 1°42| 694 + 1°70] 395 + 6 
32. Eng. Fathers (9) Peto 7e I719°5 + 1°39] 68:7 + 1t'°0| 399 + 6 
83 Germans (10) ah 3300 1659°3 + 2°3| 66°3 + 1°6| 402°37+ 10°38 
84 Eng. Criminals (11) 3000 | 1658'r + 16) 68:07 4.172) 411° +:9 
85 } Nilotik Kavirondo (1) 37 | 172970 + 7°99] 714 + 5°6| 412°81 + 32°36 
86 Loplik (3) 38 1695°0 + 6'2 | 70°3 + 4°4| 414°74 + 32°08 
37 Barabra (5) 70 | 1680 + 7:0| 70 + 3:7| 416°66+4 23°75 
88 Kachamega (1) I0o— |,:«1668'3 + «4'7 | 69°8 + 3°3 | 418°69 + 19°96 
8g Kamasia (1) 20 1719°8 +10°9 | 72°4 + 7°7| 420°91+ 44°89 
90 Aswan South (4) 95 16506 + 4°8| 694 + 3°44] 421 +21 
gt Mastuji (3) 28 | 16661 + 7°2] 70'4 + 5°1 | 422°54+ 38°08 
92 Korla (3) ce 14 1667°9 +10°2] 706 + 7°2 | 423°28453°95 
93 Scot. Insane (7) ....| 4381 1673°3 + 0°73) 72° + 0°52] 430°95+ 3°10 
94 Scot. Total (7) .> |. (440% 1668°8 + 0°75] 73'7 + 0°53] 441°40+ 3°17 
93 Bagh-jigda (3) a 12 1647°5 +11°0| 73°2 + 7°8 | 446°30+ 61°17 
96 Charklik (3) 5% 12 1678°3 +110] 74°56 + 78]| 44628 +61'44 
97 Chaga (1) uo 18 1641°6 +12°:2| 76:96+ 8:7 | 468°82 + 52°70 
98 Rabai (1) 13 162671 +14°5| 77°4 + O-2| 476°41+63'01 
99 Turkana (1) 9 1694°4 +19°4| 86:1 +13°7| 508°16 + 80°78 
100 Dinka (5) 116 | 1786° + 6°0 97°0 + 44| 543°114+24°04 




66 Records of the Indian Museum. [Vol. XXIII,


Supplementary List. 


In this List actual Coefficients of Variability are given. 


Col 2 S.D. in mm. 
- |Mean(mm.)+ | 100 x (Coeff. of 

Name of Race. No. in | +P.E. of 

Sample. P.E. of Mean. | SD. Varta of V.) 

101 Crete, whole Island | ) 

444 pe Site 318 1706't +2°6 | 67°§S +18 | 3°96 + "12 
102 Eparchies (Selinos, 

Sphakia) (12) .. 50s |:1752°6 45°4 | az +39 | 3°264 22 
103 Albanian (12) 0 ee MD | 1693°2 #3°7. | 65°77 +26 3°88 +18 
104 Cyprus (whole  Is- 

land) (12) oi 585 | 1687°7 +1'°7. | 616 +1°2 ) . 364407 
105 Cyprus (Nicosia) (12 cree 16788 +3.9 | 60°5 +2°7 3°60 + “16 
106 »» (Lapitho) (12).. 221 1680°0 +2°5 | 54°77 +1°8 3°25 +°10 
107 »» (Ekomt) (12) .. 167 1690°5 +3°2 | 60°3 +2°2 3°59 +°13 
108 ., (Levkontka) (12) 87 1689°38 +4°6 | 63°7 +3°3 377+£°19 
109 Cyprus (Leukas) (12) 42 1668':0 +6°7 | 64°3 +4°7 3864 °33 
110 Lycian Gypsies (12).. 53 16602 +44 | 47°38 +3°T 2°88 + °20 
111 Persian Jews (12) .. 57. —«| :1643°5 +5°2 ead 3°53 4°22 
112 Yemen Jews (12) .. 78 15940 +2°9 | 3°76+°20 
113 Samarkand Jews (12) 100 1064°2 +3°9 Minis 3°§524+°17 
114 Oxford students (13) 959 1765 66:08 + 3'7439 
115 Aberdeen students | 

Sos eae “ted aoe | 171770 £1°8 | 594 £33 3°4595. 


(1) Leys and Joyce, Jour. Roy. Anthrop. Inst., Vol. XLIII (1913), p. 216.
(2) Garett, Jour. Roy. Anthrop. Inst., Vol. XLII (1912), pp. 60-66.
(3) Joyce, Jour. Roy. Anthrop. Inst., Vol. XLII (1912), p. 473.
(4) Craig, Biometrika, Vol. 8 (1911), p. 75.
(5) Seligmann, Jour. Roy. Anthrop. Inst., Vol. XLIII (1913), pp. 700-702.
(6) Pearson, Phil. Trans. Roy. Soc., Vol. 184A, p. 386.
(7) Tocher, Biometrika, Vol. 5 (1906-7), p. 307.
(8) Powys, Biometrika, Vol. 1 (1901), p. 44.
(9) Pearson, Biometrika, Vol. 2 (1903), p. 379.
(10) Pearson, Chances of Death, Vol. 1, pp. 294-2090. 
(11) Macdonell, Biometrika, Vol. 1 (1901), p. 191.
(12) Buxton, Biometrika, Vol. 13 (1920), p. 104 and p. 108.
(13) Schuster, Biometrika, Vol. 8 (1911), p. 49.
(14) Orensteen, Biometrika, Vol. 11 (1915), pp. 67-81.


TABLE OF VARIABILITIES. 


There are several remarkable points about the Table of Variabilities.
The material is supposed to be homogeneous in each
case, yet we note the extreme range of variation of the coeff. of
variability. We have 1.7642 ± .2428 and 5.0816 ± .8078 as our
extreme values.

The mean variability is very near 3.6, and one very remarkable
fact is this:

I. The more highly civilised races have greater variabilities
than the average.

This confirms Pearson’s result for Cephalic Index.¹ Pearson
concludes for Cephalic Index that greater variability is a characteristic
of the “races which have been successful in the struggle for
existence, and at the present time are the dominant races of the


1 Chances of Death, Vol. 1, p. 292.

earth. At the same time the greater variability of the more dominant
and civilised peoples admits of being interpreted as a result of
the lesser severity of the struggle for existence among them. Thus
greater variability would be an effect not a cause of the higher
state of civilisation.”

Another fact which may be gathered from the above table is 
this. The more civilised races though more variable, do not in 
any case occupy the extreme ends of the table. Thus one would 
probably be justified in inferring that a higher state of civilisation 
is not associated with extreme degrees of variability. 

We may look at the same question from a different point of
view. The less civilised races occupy the extreme ends of the table
more frequently than the more civilised races. The less civilised
races, though on the whole less variable, may thus be associated
with extreme degrees of variability.


II. The greater variability of more highly civilised races seems to
be only moderate in degree and is never excessive.¹

It seems as if slightly greater variability than the stable type
of the species is accompanied by greater adaptability and hence
by a higher state of progress.


INTERRACIAL VARIABILITY. 


There is another point which deserves attention. By looking
at our general list of variabilities, we find some association
between average stature (M) and standard deviation ($\sigma$).

The point which we are considering now is the interracial correlation²
between M and $\sigma$ for the different races.

If
$$\mu_{11} = S(xy)/N,$$
then the correlation coefficient as determined by the product moment
method³ is given by
$$r = \mu_{11}/(\sigma_x \cdot \sigma_y),$$
where $\sigma_x$ and $\sigma_y$ are the S.D. of the two variables.

I find, without grouping, with base numbers 1660 mm. and
60 mm. respectively for average stature and S.D., the raw moments
to be:


For Stature, $\nu_1' = 5.24$, $\nu_2' = 1389.48$


1 In the selected list (see below) this fact is not so apparent. It seems as if
the extremely high variability of less civilised races is due to unreliability of
data.

2 This is quite distinct from the intra-racial (or within the race) correlation
between errors in Mean and errors in S.D.

In Biom. Vol. 2 (1903), Problem IX, p. 279, it is shown that

ite. oas/(o . ee . N) 


In our case, $\mu_3$ is negative, hence a taller subsample of Anglo-Indians will
show less variability and vice versa. This is actually the case with the two
subsamples we have already considered. The subsample with the higher average,
1658.75 mm., has a S.D. of 68.85 mm. as against the other with Mean = 1657.00
mm. and S.D. = 73.26 mm. $\mu_3$ being small, the correlation, however, is very small.

3 See Yule: “Theory of Statistics,” p. 171.

For Standard Deviation (base No. 60),
$\nu_1' = -.77$, $\nu_2' = 92.29$
and for v);' = 110704. 
Transferring to Mean, we get:

For Stature:

Mean value of Average stature = 1665.24 mm.
Standard Deviation = 36.9052 mm.
Coeff. of interracial Variability = 2.2162

For Standard Deviation:

Mean value of Standard Deviation = 59.23 mm.
S.D. of Standard Deviation = 9.5758 mm.
Coeff. of interracial V of S.D. = 16.17

We also have $\mu_{11} = +114.0738$.

Thus
$$R_{M,\sigma} = \frac{+114.0738}{36.9052 \times 9.5758} = +.3098.$$

The Prob. Error¹ of R is given by
$$\frac{.67449}{\sqrt{N}}\,(1 - R^2).$$

From Abac in Biometric Tables, p. 19, we find for N = 100,
P.E. of R = .062.

Thus $R_{M,\sigma} = .3098 \pm .062$.
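
The working-origin method and the probable error of R described above can be sketched as follows. This is an illustration only: the four pairs of means and standard deviations are hypothetical, the function names are not from the paper, and the probable error is evaluated from the formula rather than read from the abac.

```python
import math


def interracial_correlation(xs, ys, base_x=0.0, base_y=0.0):
    """Product-moment r from raw moments taken about working base numbers."""
    n = len(xs)
    dx = [x - base_x for x in xs]
    dy = [y - base_y for y in ys]
    # Raw moments about the base numbers.
    nu1_x, nu1_y = sum(dx) / n, sum(dy) / n
    nu2_x = sum(d * d for d in dx) / n
    nu2_y = sum(d * d for d in dy) / n
    nu11 = sum(a * b for a, b in zip(dx, dy)) / n
    # Transfer to the mean.
    var_x = nu2_x - nu1_x ** 2
    var_y = nu2_y - nu1_y ** 2
    mu11 = nu11 - nu1_x * nu1_y
    return mu11 / math.sqrt(var_x * var_y)


def probable_error_of_r(r, n):
    """P.E. of r = 0.67449 * (1 - r^2) / sqrt(N)."""
    return 0.67449 * (1.0 - r * r) / math.sqrt(n)


# Hypothetical mean statures and S.D.s (mm.), for illustration only.
means = [1650.0, 1670.0, 1690.0, 1700.0]
sds = [55.0, 61.0, 58.0, 66.0]
r_demo = interracial_correlation(means, sds, base_x=1660.0, base_y=60.0)

# P.E. for the value of R quoted in the text (R = .3098, N = 100).
pe_whole = probable_error_of_r(0.3098, 100)
print(round(r_demo, 4), round(pe_whole, 3))
```

The choice of base numbers does not affect r, since the transfer to the mean removes them. For R = .3098 and N = 100 the formula gives about .061, close to the .062 read from the abac.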

We may now consider the correlation for our selected list of
55 reliable samples.

Stature:

Mean value of Average Stature = 1663.9454 mm.
Standard Deviation of Average Stature = 36.5953 mm.
Coeff. of Variability (interracial) = 2.1993

Standard Deviation:

Mean value of Standard Deviation = 59.3453 mm.
S.D. of Standard Deviation = 6.46 mm.
Coeff. of Variation (interracial) = 10.89

$\mu_{11} = +125.889$

Thus $R_{M,\sigma} = +.3283 \pm .082$.

Selection of more reliable values does not make any substantial
difference. We may therefore conclude that there is a
positive interracial correlation of about +.3 between Average
Stature and Standard Deviation.


1 K. Pearson and L. N. G. Filon: “Probable Errors of Frequency Constants,”
etc., Phil. Trans. Vol. 191A, pp. 231-241.

Interracially, taller races are on the whole more variable than
shorter.

It will be noticed that the average stature of all the races is
1665.24 mm. and, in the case of selected races, 1663.94 mm.

The Anglo-Indians are thus slightly shorter than the general
average of all the races. But the difference is only about 7 mm.

In this connection it is interesting to compare the figures
given by Tschepourkowsky.¹ He finds for 92 Russian races the
mean value of average stature to be 1647.4 mm. and S.D. 33.3
mm., while for Deniker’s 84 living races the values are 1639.6
and 55.9 respectively.

His coefficient of interracial variation of stature is 2.02. In
our series of 100 races it is slightly higher, being about 2.22, but is
of the same order.

Thus our value of interracial variability agrees generally with
a previous value found independently by another worker.²

We can now pass on to the question of interracial correlation
between M and V.

If $v_1, v_2, v_3, v_4$ are the variabilities of $x_1, x_2, x_3, x_4$, and $r_{12}, r_{13}$, etc.,
are the correlations between $x_1$ and $x_2$, $x_1$ and $x_3$, etc., then the correlation
between $x_1/x_3$ and $x_2/x_4$ has been shown³ to be

$$\rho = \frac{r_{12}v_1v_2 - r_{14}v_1v_4 - r_{23}v_2v_3 + r_{34}v_3v_4}{\sqrt{(v_1^2 + v_3^2 - 2r_{13}v_1v_3)(v_2^2 + v_4^2 - 2r_{24}v_2v_4)}}$$

We get the correlation between M and $V = 100\sigma/M$ by putting
$x_1 = x_4 = M$, $x_2 = \sigma$, and $x_3$ equal to a constant.
Then
$$v_3 = 0, \quad v_4 = v_1, \quad r_{14} = 1, \quad r_{24} = r_{12}.$$

Thus
$$\rho_{M,V} = \frac{r_{12}v_2 - v_1}{\sqrt{v_1^2 + v_2^2 - 2r_{12}v_1v_2}}$$

For the Whole Series of 100 races:

$v_1 = 2.2162$

$v_2 = 16.17$

$r_{12} = +.3098$

Hence $\rho_{M,V} = +.1787 \pm .065$


For the Selected Series of 55 races:

$v_1 = 2.1993$
$v_2 = 10.89$


1 E. Tschepourkowsky: “Contributions to the Study of Interracial Correlation,”
Biom. Vol. IV (1905), pp. 286-312.

2 We may note however that the interracial variability is higher in our case.
This implies that our sample of races is more representative in character than
Tschepourkowsky’s.

3 K. Pearson: “On Spurious Correlation,” Proc. Roy. Soc. Vol. LX.

$r_{12} = +.3283$
$\rho_{M,V} = +.1303 \pm .088$
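
The reduced formula for the correlation between M and V can be evaluated directly from the three constants quoted for each series. The sketch below assumes the reduced form $(r_{12}v_2 - v_1)/\sqrt{v_1^2 + v_2^2 - 2r_{12}v_1v_2}$; the function and argument names are illustrative and not part of the original.

```python
import math


def correlation_mean_vs_cv(v1, v2, r12):
    """Correlation between M and V = 100*sigma/M.

    v1  : interracial coefficient of variation of the mean stature M
    v2  : interracial coefficient of variation of the S.D. (sigma)
    r12 : interracial correlation between M and sigma
    """
    numerator = r12 * v2 - v1
    denominator = math.sqrt(v1 ** 2 + v2 ** 2 - 2.0 * r12 * v1 * v2)
    return numerator / denominator


# Constants as quoted for the whole series and for the selected series.
rho_whole = correlation_mean_vs_cv(2.2162, 16.17, 0.3098)
rho_selected = correlation_mean_vs_cv(2.1993, 10.89, 0.3283)
print(round(rho_whole, 4), round(rho_selected, 4))
```

With the quoted constants this gives about +.179 for the whole series and about +.13 for the selected series, in line with the values stated in the text.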


The correlation in the latter case is scarcely significant
but seems to be slightly positive.

Thus there seems to be a small positive interracial correlation
between the average stature and the coefficient of variation.

Assuming recent races to be more variable, this positive
interracial correlation between stature and variability may be
explained on the hypothesis that tallness is a recent character
of the human species. The greater variability is not due merely
to the greater absolute size of the taller races, since the coefficient
of variability, i.e. proportional variability, itself is also positively
correlated with stature.

SECTION VII. COMPARISON OF VARIABILITIES.
STANDARD DEVIATION OF STATURE.
(a) The Whole Series.


Let us consider the 100 different values of Standard Deviation
of Stature, which I have collected for purposes of comparison.
We notice the great range of variation of the S.D.; the extreme
values are 29.5 mm. and 97.0 mm.

Grouping by units of 5 mm. we get the following distribution:

Distribution of 100 S.D. of Stature.

Group (mm.) | 29 to 34 | 34 to 39 | 39 to 44 | 44 to 49 | 49 to 54 | 54 to 59 | 59 to 64 | 64 to 69 | 69 to 74 | 74 to 79 | 79 to 84 | 84 to 89
Frequency .. <s | 25140 | Aho Gol E3554 22.2355 | TOMES 3 fo) I 


We get
Mean Value of Standard Deviation = 59.45 mm.
S.D. of Standard Deviation = 9.524 mm.


P.E. of Mean Standard Deviation = 42 12 


We can now compare our Anglo-Indian S.D. with this Mean
Value:

Anglo-Indian S.D. = 67.38 mm.
Mean value of S.D. = 59.45 mm.
Difference = 7.93 ± 6.42 mm.

The difference 7.93 ± 6.42 mm. is not at all significant. We
can find the probability of this difference:

$$\frac{x}{\sigma} = \frac{7.935}{9.524} = 0.83 \text{ approximately.}$$

From Tables II, p. 2, ½(1+α) = .7967306
                      ½(1−α) = .2032694

If we assume that our sample of 100 standard deviations is a
random or representative sample, then 20.3% of all “homogeneous”
races will have a S.D. greater than the Anglo-Indians, and
40.6% will differ more from the average value than Anglo-Indians.
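
The table look-up above can be checked numerically. This is a minimal sketch, assuming that the tabulated ½(1+α) is the Gaussian probability integral at the deviate x/σ, entered to two decimals as in the text; the function names are illustrative.

```python
import math


def normal_cdf(z):
    """Gaussian probability integral: the tables' 1/2 (1 + alpha) at deviate z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tail_percentages(difference, sd):
    """Return (% above, % differing more either way) for a deviation from the mean."""
    z = round(difference / sd, 2)  # the tables are entered to two decimals
    upper = 1.0 - normal_cdf(z)
    return 100.0 * upper, 200.0 * upper


# Anglo-Indian difference and S.D. of the 100 standard deviations (mm.).
above, either_way = tail_percentages(7.935, 9.524)
print(f"{above:.1f}% above, {either_way:.1f}% differing more")
```

The same function applies unchanged to the later comparisons of the coefficient of variation, since each of them rests on a difference divided by the S.D. of the series.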

For Stature, the absolute variability (Standard Deviation) of
Anglo-Indians is thus not significantly greater than the average absolute
variability of homogeneous races.

It will be noticed that the list contains many small samples.
It will be better to omit all samples of less than 25. Doing this
we find that extreme values have been mostly eliminated by this
process of selection, showing that such extreme values were
probably in most cases due to uncertainty of sampling rather than
to any peculiarity of the population.
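
The selection rule can be sketched as a simple filter. The thresholds below (at least 25 individuals, P.E. of V below .22) are taken from the note to Table 5 and the paragraph above; whether a sample of exactly 25 counts as selected is an assumption, and the class and function names are illustrative. The five rows are transcribed from Table 5 with V and its P.E. divided by 100.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    name: str
    n: int          # number of individuals
    v: float        # coefficient of variation
    pe_v: float     # probable error of V


def select_reliable(samples, min_n=25, max_pe=0.22):
    """Keep samples with at least min_n individuals and P.E. of V below max_pe."""
    return [s for s in samples if s.n >= min_n and s.pe_v < max_pe]


# A few rows transcribed from Table 5 (V and P.E. divided by 100).
rows = [
    Sample("Segua", 12, 1.7642, 0.2428),
    Sample("Swahili", 53, 3.0541, 0.2001),
    Sample("Camb. Students", 1000, 3.6958, 0.0558),
    Sample("Turkana", 9, 5.0816, 0.8078),
    Sample("Dinka", 116, 5.4311, 0.2404),
]
selected = [s.name for s in select_reliable(rows)]
print(selected)
```

In this sketch both extreme rows (Segua and Turkana) drop out on sample size alone, which is the effect described in the text. The Dinka group drops out here on the P.E. criterion, though the paper also excludes it on separate grounds.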

I have also thought it best to exclude the Scottish Insane as well
as the Dinka group. We have already seen that Anglo-Indian
variability is not significantly greater than the average variability
of the whole series. The inclusion of any variability greater than
Anglo-Indian variability will strengthen this conclusion; rejection
of greater variabilities will go against our conclusion. The
Insane are manifestly abnormal and may be neglected for the
present. Variability of the Dinka people is greater than that of
Anglo-Indians; its rejection will thus make the test more rigid.
Separate figures for Aswan are also omitted for similar reasons.¹


For the selected series of Standard Deviations:
Selected Mean Stand. Dev. = 59.8929 mm.
S.D. of Standard Dev. = 6.3504 mm.


We notice that the selected Mean is almost exactly the same 
as the Mean for the whole series. We conclude that 60 mm. is 
about the true average absolute Variability of stature for human races. 

Due to selection the S.D. of Variability is considerably reduced,
because the extreme values of Variability have in most cases
been eliminated.


Anglo-Indian S.D. = 67.385 mm.
Selected Mean S.D. = 59.893 mm.
Anglo-Indian Difference = 7.492 mm.


We find the probability:


From Biometric Tables II, ½(1+α) = .8809999
                          ½(1−α) = .1190001


Thus 11.9% of all races will have greater variabilities than
Anglo-Indians, while 23% will differ more from the Selected Mean.

As judged by a reliable series of standard deviations, the Absolute
Variability of Anglo-Indians is not significantly greater than
the Average Variability of different “homogeneous” samples.


RELATIVE VARIABILITY OF STATURE. 

We shall now compare the Relative Variability (as measured
by the Coefficient of Variation) of our Anglo-Indian data with the
variability of samples recognised to be homogeneous.


1 J. I. Craig: Biometrika Vol. 8 (1911), p. 70.


(a) Whole Series.


Distribution of 100 Coefficients of Variation of Stature.


Group     | 1.80 to 2.20 | 2.20 to 2.60 | 2.60 to 3.00 | 3.00 to 3.40 | 3.40 to 3.80 | 3.80 to 4.20 | 4.20 to 4.60 | Beyond 4.60 | Total.
Frequency | 2 | 3 | 9 | 20.5 | 34.0 | 20.5 | 8.0 | 3 | 100


Grouping in units of .4, we find the moment-coefficients about
the Mean:


Lo= 1°88 66 

H3= —'77 80 

Py=IVOI 74 74 
giving B\= °09 46 73 

Ba= 3°37.20.20.4- "63 30 
with sk. = +13 448 © 

Mean Coefficient of Variation = 3.5700

and S.D. of Coefficient of Variation = .5450


Curve belongs to Type IV, but the Gaussian itself will be
quite adequate.


“Goodness of Fit” of Coefficients of Variation.


Coeff. of V. | Observed Frequency m′ | Theoretical Frequency m | m − m′ | (m − m′)²/m
Beyond 2.20 | 2 | .742 | 1.258 | 2.1320
2.20-2.60 | 3 | 3.538 | .538 | .0818
2.60-3.00 | 9 | 11.512 | 2.512 | .5481
3.00-3.40 | 20.5 | 22.912 | 2.412 | .2531
3.40-3.80 | 34.0 | 27.934 | 6.066 | 1.3570
3.80-4.20 | 20.5 | 20.769 | .230 | .0028
4.20-4.60 | 8.0 | 9.459 | 1.459 | .2250
Beyond 4.60 | 3.0 | 3.130 | .130 | .0054
Total χ² = 4.5660

n = 8, χ² = 4.566
Piz o9t Od, 0 


Thus the Gaussian gives excellent fit. In seven cases out of 
ten, the fit will be worse. 
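
The test statistic can be recomputed from the observed and theoretical frequencies of the table. A minimal sketch, assuming the usual form of Pearson's χ² with the theoretical frequency in the denominator; the function name is illustrative and no probability P is computed here.

```python
def chi_square(observed, theoretical):
    """Pearson's chi-square: sum of (m' - m)^2 / m over the groups."""
    return sum((o - t) ** 2 / t for o, t in zip(observed, theoretical))


# Observed and theoretical (Gaussian) frequencies from the table above.
observed = [2, 3, 9, 20.5, 34.0, 20.5, 8.0, 3.0]
theoretical = [0.742, 3.538, 11.512, 22.912, 27.934, 20.769, 9.459, 3.130]

chi2 = chi_square(observed, theoretical)
print(f"chi-square = {chi2:.3f} over {len(observed)} groups")
```

Recomputed in this way the total agrees with the printed 4.566 to within rounding, although some of the individual contributions in the printed column differ slightly in the later decimals.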

We notice that one terminal frequency gives rather a large
value, i.e. 2.1320. Combining the two end groups, we get

χ² = 2.555
P = .854587


The fit is now considerably improved. I conclude that the
Coefficient of Variation (for homogeneous groups) can itself be graduated
by the Gaussian curve. We can now safely apply the theory
of Errors (which is based on the Gauss-Laplacian Probability Integral)
to judge the likelihood of deviations from the Mean.


Anglo-Indian V = 4.0672
Average V = 3.5700


Anglo-Indian Difference = .4972
Now the S.D. of V = .5450
Thus, P.E. of V = ± .3676


Anglo-Indian Difference = .4972 ± .3676

$$\frac{x}{\sigma} = \frac{.4972}{.5450} = .91 \text{ approximately.}$$


From Biometric Table II, ½(1+α) = .818588
                         ½(1−α) = .181412


Thus we find that no less than 18.14% of “homogeneous”
races will have larger Coefficients of Variation than Anglo-Indians.
The Anglo-Indian Coefficient of Variation is not significantly
greater than the average Coefficient of Variation of the whole series.


(b) Selected Series.

We obtain the following distribution of the Coefficients of
Variation for 55 selected¹ races (unit of grouping = .2).


Distribution of 55 selected Coefficients of Variation.


Group     | 2.7 to 2.9 | 2.9 to 3.1 | 3.1 to 3.3 | 3.3 to 3.5 | 3.5 to 3.7 | 3.7 to 3.9 | 3.9 to 4.1 | 4.1 to 4.3 | Total.
Frequency | 3 | 5.5 | 1.5 | 9.0 | 17.5 | 9.5 | 4.0 | 5.0 | 55

We get,
Mean Coefficient of Variation = 3.571
Standard Deviation of Coefficient of Variation = .3590
P.E. of Mean V 


The other constants are :— 
Mg= 3°22 20 45 
Me= 1°59 62 o1 
4,=29°89 12 68 
B,= 06 61 53 
B,= 2°97 93 


| 
ws 
tk 
N 
an) 


oe. a 


1 It will be noticed that the extreme values have been automatically excluded
by our principle of rejection of unreliable values.


1922.] P. C. MAHALANOBIS: Analysis of Stature. 75


The stability of the Mean Value is remarkable. For the whole
series it was 3.57, for the selected races it is 3.571. It therefore
seems likely that 3.57 is very near the true typical coefficient of
variation (of stature) for homogeneous non-Indian samples.

The S.D. is much reduced by selection. This is now .3590
as against .5450 for the whole series. We have selected the more
reliable values, but this has also excluded almost all extreme
values. Great divergence from the Mean value is thus probably
due more to paucity of material than to actual peculiarities of
distribution.

Anglo-Indian V = 4.0672
Selected Mean V = 3.5710


Anglo-Indian Difference = .4962 ± .2421


The actual difference is again the same, but this is now
nearly twice the Probable Error.
We have,


x = D/σ = 1.38 approximately


From Biometric Table II, ½(1+α) = .91 62 04 7
½(1−α) = .08 37 95 3


Thus 8.38% of all reliable samples will actually be more
variable than Anglo-Indians, while 16.55% will differ more from
the Mean.

Anglo-Indian Variability of stature is not significantly higher
than the average Variability of selected samples.


(c) Selected and Weighted Series. 


Still another course is open to us. We can consider the
“weighted Mean”¹ and “weighted” Standard Deviation of the
Coefficient of Variation. For this purpose, we choose our weights
to be proportional to 1/E², where E is the probable error, i.e.
give “weights” proportional to reliability.
We get,
Weighted Mean V = 3.7622
Weighted S.D. of Mean V = .1846
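
The weighting scheme may be illustrated as follows. This is a sketch only: it assumes the weights are the inverse squares of the probable errors, and the three samples are hypothetical figures chosen for illustration, not values from the paper.

```python
import math

def weighted_mean_and_sd(values, probable_errors):
    """Mean and S.D. of values weighted in proportion to 1/E^2."""
    weights = [1.0 / (e * e) for e in probable_errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return mean, math.sqrt(variance)

# Hypothetical coefficients of variation with their probable errors.
values = [3.2, 3.6, 4.0]
errors = [0.30, 0.10, 0.20]

w_mean, w_sd = weighted_mean_and_sd(values, errors)
plain_mean = sum(values) / len(values)

print(round(w_mean, 4), round(w_sd, 4), round(plain_mean, 4))
```

The sample with the smallest probable error dominates the result, which is the effect ascribed to the U.S.A. recruits in the text.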


We notice that the Mean is now considerably higher. This is 
due to the much greater reliability in the measurements of the 
more civilised races, who have invariably higher variabilities. This 
greater value is also due in a large measure to the weight of the 
U.S.A. recruits (w=7623 against 10 for the lowest weight) which 
includes 25,898 individuals. 


¹ See Yule: “Theory of Statistics” (Charles Griffin, 1919), p. 220.

Anglo-Indian V = 4.0672
Selected Weighted Mean V = 3.7622


Anglo-Indian Difference = .3000
P.E. of Difference = ± .1245
x = D/σ = 1.63 approximately


From Biometric Table II, ½(1+α) = .94 84 49 3
½(1−α) = .05 15 50 7

5.1% will be more variable, while 10.2% will differ more
from the weighted average than Anglo-Indians.

Thus even when compared with the weighted Mean, Anglo-Indian
Variability is not significantly greater than average Variability.

We have seen that U.S.A. recruits raise the weighted Mean 
very considerably. But it is not at all certain that the recruits of 
the U.S.A. Army are possessed of any great degree of homogeneity. 
One would surmise rather that they are heterogeneous in character. 
Let us see the effect of leaving out U.S.A. recruits.

Omitting U.S.A. recruits we get

Weighted Mean V = 3.6413
Weighted S.D. of V = .2509
Weighted P.E. of V = ± .1683
Anglo-Indian V = 4.0672
Weighted Mean V = 3.6413


Difference = .4259 ± .1683


From Biometric Table II, ½(1+α) = .95 54 34 5
½(1−α) = .04 45 65 5


4.5% will have greater Variabilities than the Anglo-Indian
sample. As regards the Coefficient of Variation, this is the most
stringent test we can apply with the non-Indian material at our
disposal. We find

Anglo-Indian Variability is within the limits of probability of
homogeneous Variation. Study of the Coefficient of Variation for
Stature does not enable us to assert definitely that the present
Anglo-Indian sample is heterogeneous in character.

I shall now consider the whole series of non-Caste samples
including the Mediterranean samples. I have omitted the separate
age-groups for the New South Wales Criminals and the Oxford
student data. As all these have greater Variability than the
Average, the stringency of our test will not be diminished by this
rejection (see discussion on p. 72). Another reason why I have
omitted the different age-groups is this. My purpose is to compare
the Anglo-Indian Variability with the general average Variability
of other races. If the Coefficient of Variation for the same race is
given several times over under different age-groups, too much weight
will obviously be given to this particular race. I also omit Dinka.¹


Distribution of Coefficients of Variation of 107 non-Caste Samples.
(Frequencies that cannot be read in this copy are shown as "…".)


Group     | Beyond 1.80 | -1.90 | -2.00 | -2.10 | -2.20 | -2.30 | -2.40 | -2.50 | -2.60 | -2.70
Frequency | 1 | … | 3 | 1 | … | … | … | 0 | … | …

Group     | -2.80 | -2.90 | -3.00 | -3.10 | -3.20 | -3.30 | -3.40 | -3.50 | -3.60 | -3.70
Frequency | 4 | 3 | 3 | 8 | 4 | 3 | 5 | … | … | 9

Group     | -3.80 | -3.90 | -4.00 | -4.10 | -4.20 | -4.30 | -4.40 | -4.60 | -4.80 | -5.00
Frequency | 7 | 8 | 9 | … | … | 4 | … | … | … | 1


Grouping in units of .1, I find
μ2 = 28.99 60 64,
μ3 = −50.77 00 11,
μ4 = 3252.08 73 71

Thus β1 = .10573 and
β2 = 3.868


Curve is of Type IV, but to a first approximation we can 
apply the “‘ normal’’ curve of errors. 
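
The constants β1 and β2 follow from the moments by the usual relations β1 = μ3²/μ2³ and β2 = μ4/μ2². A brief sketch, assuming the moments read as 28.996064, −50.770011 and 3252.087371 (the last two digits of μ3 are uncertain in this copy); the function name `betas` is illustrative.

```python
def betas(mu2, mu3, mu4):
    """Pearson's beta constants from the central moments."""
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    return beta1, beta2

# Moments of the 107 non-Caste coefficients, in grouping units of .1.
b1, b2 = betas(28.996064, -50.770011, 3252.087371)

print(round(b1, 5), round(b2, 3))
```

For the Gaussian curve β1 = 0 and β2 = 3, so a value of β2 above 3 indicates the lepto-kurtosis referred to in the text.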

Mean Coefficient of Variation (107 samples) = 3.5353 ± .0348

Standard Deviation of Coefficient of Variation = .5385 ± .0245

The Mean Value is slightly lower than the one found earlier.
This is due to the fact that I have omitted the Dinka group here.
If we include the Dinka group, the Mean Value would be raised
to 3.553 which compares favourably with the value 3.570, a
difference of .035 only.


Anglo-Indian Coefficient of Variation = 4.0672
Mean Coefficient of Variation = 3.5353


Anglo-Indian Difference = .5319
x = D/σ = .5319/.5385


From Biometric Table II, ½(1+α) = .83 84 217
½(1−α) = .16 15 783


¹ See discussion on p. 62.

Thus as before the Anglo-Indian sample does not seem to be
significantly more variable than homogeneous samples. About
16% of homogeneous samples will have a greater Variability.


(b) Selected Series.


Let us now select samples greater than 25. We get a total
(omitting different age-groups) of 67 samples distributed as
follows:


Distribution of 67 Selected Coefficients of Variation.
(Frequencies that cannot be read in this copy are shown as "…".)


Group     | Beyond 2.70 | -2.80 | -2.90 | -3.00 | -3.10 | -3.20 | -3.30 | -3.40 | -3.50 | -3.60
Frequency | 2 | 2 | 1 | 5 | … | 2 | 2 | 5 | 15 | 7

Group     | -3.70 | -3.80 | -3.90 | -4.00 | -4.10 | -4.20 | -4.30 | Total.
Frequency | … | … | … | … | … | 1 | 1 | 67

We get, μ2 = 12.97 78 82
μ3 = 17.77 40 69
μ4 = 477.70 38 55

giving β1 = .14 43 94 ± .12 47
β2 = 2.83 53 67 ± .38 28


Graduation by the “normal” curve is thus possible and we
are justified in using the “normal” Probability Integral.
Mean Value of Coefficient of Variation = 3.5843 ± .0297
Standard Deviation of Coefficient of Variation = .3602 ± .0210

It will be noticed that the Mean Value 3.584 is sensibly the
same as we had obtained without including the Mediterranean
data, i.e. 3.571. The difference is only .013 while the probable
error is certainly greater than .03. Thus 3.58 may be safely
taken as a standard value for the Coefficient of Variation for
Stature of homogeneous non-Caste samples.

The mean value for the whole series, 3.5353, is smaller than
the mean value for selected samples, 3.5843, because in small
samples the dispersion is more likely to be smaller.¹

Let us now compare the Anglo-Indian Variability with the 
above Mean Variability. 


Anglo-Indian Coeff. of Variation = 4.06 72
Mean Selected Coeff. of Variation = 3.58 43


Anglo-Indian Difference = 0.48 29

Hence x = .4829/.3602 = 1.34
From Table II, ½(1+α) = .90 98 773
½(1−α) = .09 01 227


¹ For a discussion of the dependence of Standard Deviation on the size of
sample see Biometrika, Vol. 10 (1915), p. 522 and Vol. 11 (1916), p. 277.


Thus nearly 9% of homogeneous samples will have a greater
Variability. The inclusion of the new Mediterranean series does
not affect our previous conclusion.

The Variability of the Anglo-Indian sample though higher than 
the Average is not excessively so and the difference is not statistically 
significant. 

INDIAN CASTE VARIABILITY.


(a) Whole Series. 


I shall now consider the Coefficient of Variation of the Indian 
Caste data of Risley. Omitting 3 tribes in which the sample 
consists of only 2 individuals I get a total of 84 Castes and Tribes. 


Distribution of 84 Caste Coefficients of Variation.
(Frequencies that cannot be read in this copy are shown as "…".)


Group     | -2.1 | -2.2 | -2.3 | -2.4 | -2.5 | -2.6 | -2.7 | -2.8 | -2.9 | -3.0
Frequency | 2 | 1 | 2 | … | 1 | 0 | 1 | 3 | 8 | …

Group     | -3.1 | … | -4.0
Frequency | … (only the fragments 12 and 7 are legible)



Grouping by .1, I get
Mean Value of Caste Coefficient of Variation = 3.2395
Standard Deviation of Coefficient of Variation = .3943


Anglo-Indian Coefficient of Variation = 4.0672
Anglo-Indian Difference = .8277
So x = .8277/.3943 = 2.099,


From Biometric Table II, ½(1+α) = .98 21 356
½(1−α) = .01 78 644

Only about two per cent of Indian Caste samples will show 
greater variability. It seems therefore likely that the Anglo- 
Indian sample is really differentiated from the Indian Castes in 
showing a just significant degree of greater variability. 

It should be noted that the Caste Variability is much lower 
than the non-Caste Variability. 


We have
Non-Caste Variability = 3.5700 ± .0368
Caste Variability = 3.2395 ± .0290


Caste Difference = .3305 ± .0422

The difference is nearly eight times the probable error of the
difference. Hence we conclude that Caste Variability is
significantly lower than the Average Variability of other homogeneous
samples.

It is interesting to find that while the Anglo-Indian sample
is not significantly more variable than non-Caste samples, it does
seem to be just significantly more variable than Caste samples.

The Anglo-Indian sample is ‘‘ mixed” from a Caste stand- 
point but is not so from the standpoint of ordinary stable popu- 
lations. We shall see later that the Anglo-Indians are about as 
variable as modern European samples. 




(b) Selected Indian Castes. 


I now select samples of 25 and more from the Caste data. 


Distribution of 70 Selected Caste Coefficients of Variation.
(The table is largely illegible in this copy; only the entries shown survive, and "…" marks what is lost.)


Group     | -2.5 | -2.6 | -2.7 | -2.8 | -2.9 | -3.0 | …
Frequency | 0 | … | 2 | …

Group     | -3.2 | -3.3 | -3.4 | -3.5 | … | Total.
Frequency | 11 | 12 | 6 | 6 | 5 | 3 | 0 | …



With .1 as the grouping unit, I find
Mean Selected Coefficient of Variation = 3.3043 ± .0278
Standard Deviation of Coeff. of Variation = .3458 ± .0197
Mean non-Caste Coeff. of Variation = 3.5710 ± .0326
Caste Difference = 0.16 67 ± .0429
In this case also the difference is nearly four times the probable
error. We conclude that the Indian Caste samples have got a
substantially lower Variability than non-Caste samples.
We shall now compare Anglo-Indian Variability with the
selected Caste Variability.
Anglo-Indian Variability = 4.0672
Selected Caste Variability = 3.3043


Anglo-Indian Difference = .7629
Thus x = .7629/.3458 = 2.206
From Biometric Table II, ½(1+α) = .98 64 474
½(1−α) = .01 35 526


The chance is only 13 in 1000 that the Variability of an
Indian Caste will be greater than Anglo-Indian Variability. These
are the lowest odds we have got up till now.

To sum up,

The Anglo-Indian Variability is significantly greater than Caste
Variability but is not beyond the range of homogeneous Variability.

Other Comparisons.

I shall give a short summary of other comparisons, reserving
a fuller discussion for a future paper on the Caste data.

Pooling together the 84 Caste and the 109 other samples we
get a total of 193 (all samples).

I find 


Mean Value of Coefficient of Variation = 3.4231 ± .0240
Anglo-Indian Coefficient of Variation = 4.0672


Anglo-Indian Difference = .6441
Standard Deviation = .4949 ± .0169


Thus x = .6441/.4949 = 1.30
From Biometric Table II, ½(1+α) = .90 31 995
½(1−α) = .09 68 005.
Anglo-Indian Variability would be exceeded by nearly 10% of 
total (Caste and non-Caste) samples. 


Selecting samples greater than 25 we get a total of 137 fairly 
reliable samples. 


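Distribution of 137 Selected Coefficients of Variation.
(Frequencies that cannot be read in this copy are shown as "…".)


Group     | -2.6 | -2.7 | -2.8 | -2.9 | -3.0 | -3.1
Frequency | … | … | … | … | … | …

Group     | -3.2 | -3.3 | -3.4 | -3.5 | -3.6 | -3.7 | -3.8 | -3.9 | -4.0 | -4.1
Frequency | 13 | 14 | 11 | 21 | 12 | 9 | 8 | 9 | 1 | 4

Group     | -4.2 | -4.3 | Total.
Frequency | … | 1 | 137
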
Grouping by .1 I find:
μ2 = 14.42 12 29
μ3 = −17.98 72 90
μ4 = … (figure illegible in this copy)
Hence
β1 = .10796 ± .14439
β2 = 3.65892 ± … (probable error illegible in this copy)


Thus we are justified in applying the normal integral for
calculating the chances for any deviation.

Anglo-Indian Variability = 4.0672
Mean Selected (137 samples) = 3.4412 ± .0219


Anglo-Indian Difference = .6260
Standard Deviation = .3797 ± .0155


From Biometric Table II, ½(1+α) = .94 73 839
½(1−α) = .05 26 161


Thus over 5% will have greater Variability. The difference 
can scarcely be called significant. 


Standard Deviations.
I shall merely give the final results. (The complete figures
will be published in a supplement).
(a) All Samples (Caste and others), total = 190.
Mean Standard Dev. = 57.0684 ± .4271 mm.
Anglo-Indian Standard Dev. = 67.3850


Anglo-Indian Difference = 10.3167
S.D. of Standard Dev. = 8.7302 ± .3020.


x = D/σ = 1.18,


From Biometric Table II, ½(1+α) = .88 09 999.
½(1−α) = .11 90 001.
Thus nearly 12% will have a greater Standard Deviation
than the Anglo-Indian sample.
(b) Selected Samples (Caste and others) greater than 25, total = 134
Mean Standard Dev. = 56.7612 ± .3987 mm.
Anglo-Indian Standard Dev. = 67.385


Anglo-Indian Difference = 10.6238
S.D. of Standard Deviation = 6.8424


From Biometric Table II, ½(1+α) = .93 94 292
½(1−α) = .06 05 708
Six per cent will have a greater variability than the Anglo-Indians.
(a) All Non-Caste Samples, total = 106.
Mean Standard Dev. = 59.2830 ± .6138 mm.
Anglo-Indian Standard Dev. = 67.385


Anglo-Indian Difference = 8.102
S.D. of Standard Deviation = 9.3688 ± .4340
x = 8.102/9.3688 = 0.864,
½(1+α) = .80 51 055
½(1−α) = .19 48 945
Over 19% will have greater absolute variability.
(b) Selected Non-Caste Samples greater than 25, total = 64.


Mean Standard Dev. = 60.6563 ± .5453 mm.
Anglo-Indian Difference = 6.7287
S.D. of Standard Deviation = 6.4676 ± .3856
x = 6.7287/6.4676 = 1.04
From Biometric Table II, ½(1+α) = .85083
½(1−α) = .14917

(a) All Caste Samples, total = 84


Mean Caste S.D. = 53.0714 ± .4693 mm.
Anglo-Indian S.D. = 67.385

Anglo-Indian Difference = 14.314
S.D. of Standard Deviation = 6.3785 ± .3320
x = 14.314/6.3785 = 2.244,
From Biometric Table II, ½(1+α) = .98 74 545
½(1−α) = .01 25 455
Only 12 in 1000 castes will have a greater variability than
the Anglo-Indian sample. Thus we may conclude that the Absolute
Variability of the Anglo-Indian sample is appreciably greater
than Caste Variability.


Also


Non-Caste Mean S.D. = 59.2830 ± .6138 mm.
Caste Mean S.D. = 53.0714 ± .4693


Caste Difference = 6.2116 ± .7727


Thus Absolute Variability of Caste samples is significantly
less than Non-Caste Variability.
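
The probable error attached to a difference of two means can be reproduced by combining the two probable errors in quadrature. This is a sketch assuming the two series are independent; the function name is illustrative.

```python
import math

def difference_with_pe(a, pe_a, b, pe_b):
    """Difference a - b and its probable error for independent estimates."""
    return a - b, math.sqrt(pe_a ** 2 + pe_b ** 2)

# Mean Standard Deviations (mm.) quoted for the whole series.
non_caste, pe_non_caste = 59.2830, 0.6138
caste, pe_caste = 53.0714, 0.4693

diff, pe_diff = difference_with_pe(non_caste, pe_non_caste, caste, pe_caste)
ratio = diff / pe_diff

print(round(diff, 4), round(pe_diff, 4), round(ratio, 1))
```

The ratio of the difference to its probable error, here about eight, is the quantity on which the judgment of significance rests.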


(b) Selected Caste Samples greater than 25, total = 70
Mean Selected Caste S.D. = 53.8
Anglo-Indian S.D. = 67.385
Anglo-Indian Difference = 13.585
S.D. of Standard Deviation = 5.4938 ± … (probable error illegible in this copy)
x = D/σ = 2.471
From Biometric Table II, ½(1+α) = .99 32 443
½(1−α) = .00 67 557
Thus only about 7 in 1000 will have greater variability.


Again,
Selected Non-Caste S.D. = 60.6563 ± .5453 mm.
Selected Caste S.D. = 53.8 ± .4429


Caste Difference = 6.8563 ± .7025


Selected Caste Variability is thus significantly less.

We conclude from our comparative study of variabilities that
Anglo-Indian Variability though high is not sufficiently so to enable
us to assert that the material is heterogeneous. The Anglo-Indian
sample is however markedly more variable than the Risley Samples
of Indian Castes and Tribes.

I shall now consider a series of modern European races for 
which reliable data is available. 


Modern European Races.


                                S.D.
Aberdeen students (493)         59.4 mm.
Cyprus (585)                    61.6
Cambridge students (1000)       64.6
U.S.A. recruits (25,898)        65.6
Albanians (140)                 65.7
N.S.W. Criminals (2871)         65.8
Oxford students (959)           66.1
Germans (390)                   66.8
Crete (318)                     67.5
Eng. Criminals (3000)           68.1
Eng. Fathers (1078)             68.7
Eng. Sons (1078)                69.4

Anglo-Indian S.D. = 67.385 mm.
Average European S.D. = 65.775
Anglo-Indian Difference = 1.61
S.D. of S.D. = 2.75
Anglo-Indian excess in terms of S.D. = 1.61/2.75 = 0.5855


Thus the Anglo-Indian variability is only 1.61 mm. greater
than average variability of European races. We have however
included no less than five different English samples. If we retain
the largest English sample (3000 criminals) we get the Mean
variability to be 65.375 mm. with a S.D. of 2.513 mm. The
Anglo-Indian excess is 2.0 mm. or in terms of the S.D. is 0.79586.

We conclude that Anglo-Indian Variability is of the same
order as modern European variability.
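
The average and dispersion of the twelve European Standard Deviations can be verified directly from the list. The sketch below assumes the S.D. of the twelve values is taken with divisor n, which reproduces the printed 2.75; the variable names are illustrative.

```python
import math

# Standard Deviations of stature (mm.) for the twelve European samples.
european_sd = {
    "Aberdeen students": 59.4, "Cyprus": 61.6, "Cambridge students": 64.6,
    "U.S.A. recruits": 65.6, "Albanians": 65.7, "N.S.W. Criminals": 65.8,
    "Oxford students": 66.1, "Germans": 66.8, "Crete": 67.5,
    "Eng. Criminals": 68.1, "Eng. Fathers": 68.7, "Eng. Sons": 69.4,
}
anglo_indian_sd = 67.385

values = list(european_sd.values())
n = len(values)
mean = sum(values) / n
sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)

# Anglo-Indian excess expressed in units of the S.D. of the S.D.
excess = (anglo_indian_sd - mean) / sd

print(round(mean, 3), round(sd, 2), round(excess, 4))
```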


CONCLUSIONS. 


I have proposed five distinct tests of “homogeneity.”
I The frequency distribution should be homotypic.
II It should resist statistical dissection;
III Subsamples should not differ significantly;
IV The general nature of the distribution should be similar
to homogeneous distribution.
V The Variability should not differ significantly from the
average Variability of homogeneous races.

(1) I have shown that graduation by the Gaussian (possibly
still better by a Type IV curve) is adequate. Anglo-Indian
frequency distribution is certainly homotypic. Our first test thus fails
to show any sign of heterogeneity in the material.

(2) Excepting for a very special type of dissection (which is
probably a peculiar feature of the particular sample considered)
statistical analysis into component groups is not possible. Our
second test too fails to detect heterogeneity.

(3) We have seen that the difference between subsamples is
statistically insignificant. Subsamples seem to agree quite well,
thus confirming statistical homogeneity of the material.

(4) The general nature of Anglo-Indian frequency distribution
is also similar to other homogeneous distribution. Anglo-Indian
distribution is approximately Gaussian with some tendency towards
type IV, lepto-kurtosis and small asymmetry. Other known cases
of stature distribution show the same characteristics. The fourth
test thus supports the view that the present material is
homogeneous.

(5) I have compared the Variability of the Anglo-Indian with
Variabilities of other races in many different ways.

Anglo-Indians are more variable than the Indian Castes and
Tribes but the Variability of the Anglo-Indian sample is not
significantly greater than the average Variability of homogeneous
samples in general.


SECTION VIII. NOTE ON CORRELATION BETWEEN AGE AND STATURE.


I shall give a short summary of the values of the Coefficient 
of Correlation between Age and Stature, reserving a fuller discus- 
sion for a future part. 


(a) The whole series (all ages), total = 191.

The age has been recorded in the case of 191 out of the total
group of 200 which we have been considering so far. I have used
the standard “product moment” method.¹

I find for stature, with 50 mm. unit of grouping and 1660 mm.
as base number,


ν₁ = −.14136, and ν₂ = … (figure lost in this copy)
Thus Mean Stature = 1656.14 mm.
S.D. = σ_s = 65.4923 mm.


For age, with one year unit of grouping and base number = 24
years,


ν₁ = +.27

and ν₂ = 44.08.
Thus Mean Age = 24.27 years
S.D. = σ_a = 6.7022 years.


With the same units and base numbers we find the product
moment to be +40.2670.
Correcting for base number, we have


ν₁₁ = Product moment = +40.22


Thus r = ν₁₁/(σ_a σ_s), the denominator being 6.7022 × 65.4923,


= +.1089
The Probable Error is given by .6745 (1 − r²)/√n.
n = 191, hence P.E. is = ± .049.


We have then
r = +.1089 ± .049.


The correlation coefficient is slightly over twice its Probable
Error, hence it is not definitely significant. In any case the
correlation between age and stature seems to be small.
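
The Probable Error of the correlation coefficient can be recomputed from the formula .6745 (1 − r²)/√n. This is a sketch with an illustrative function name; the value it yields is close to, though not identical with, the ± .049 printed in the text.

```python
import math

def probable_error_of_r(r, n):
    """Probable error of a correlation coefficient r from n pairs."""
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)

r = 0.1089   # correlation between age and stature, whole series
n = 191      # individuals with recorded age

pe = probable_error_of_r(r, n)
ratio = r / pe   # how many probable errors the coefficient amounts to

print(round(pe, 4), round(ratio, 2))
```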

The low average age of the whole sample shows the presence
of a considerable number of individuals in their early youth. I
next separated the measurements of those above 25 years of age
from the measurements of those below 25, and considered the
correlation for the two different age groups separately.


¹ Yule, Statistics, Chap. IX.
Karl Pearson: “Regression, Heredity and Panmixia,” Phil. Trans. Roy.
Soc., Vol. 187A, 1896.


(b) Age group below 25, total = 125.
I find for age

Mean Age = 20.52 years
S.D. = σ_a = 2.2449 years
For stature, Mean Stature = 1649.35 mm.
S.D. = σ_s = … (figure illegible in this copy)
Also ν₁₁ = Product moment = +20.16


We notice that the average stature of the lower age group is 
only 7 mm. less than the general average. The S.D. is also less 
than the general average, showing that the lower age group is less 
variable than the general sample. I shall come back to this point 
later on. 

We find the coefficient of correlation to be


r = +.1464 ± .058


The correlation is positive but small. It is just on the verge 
of being significant. The positive character of the coefficient is of 
course expected, it merely indicates, or rather actually measures 
the average rate of growth with age. The material includes only 
a few cases of 16, the lowest age group, and so it is not possible to 
say very much about the actual variations in the rate of growth. 
The smallness of the coefficient (if not due to errors of sampling)
seems to suggest that the greater part of the increase in stature is 
attained before the age of 16 or 17. Thus the Anglo-Indian seems 
to be, so far as stature is concerned, rather precocious in growth. 
I shall discuss this point after investigating the correlation between 
age and the other characters. 


(c) Age group above 25, total = 60.


I find
Mean Age = 35.30 years
S.D. = σ_a = 6.5765 years
Mean Stature = 1688.1818 mm.
S.D. = σ_s = 71.0720 mm.
Product moment = −55.4884

Thus, r = −.1187 ± .08.


The coefficient is now negative but is scarcely significant in 
view of its large probable error. A small negative correlation is to 
be expected in view of the shrinkage which sets in after 25 or 30. 
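
The coefficient for the older group can be checked from the product moment and the two Standard Deviations, since r is the product moment divided by the product of the Standard Deviations. A sketch, assuming the stature S.D. reads 71.0720 mm. and that the product moment is expressed in years × mm.; the function name is illustrative.

```python
def correlation(product_moment, sd_x, sd_y):
    """Product-moment correlation from the moment and the two S.D.s."""
    return product_moment / (sd_x * sd_y)

# Figures quoted for the age group above 25.
product_moment = -55.4884
sd_age = 6.5765        # years
sd_stature = 71.0720   # mm.

r = correlation(product_moment, sd_age, sd_stature)

print(round(r, 4))
```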


¹ Powys: “Anthropometric Data from Australia,” Biometrika, Vol. I (1902),
p. 49.

The value of the Absolute Variability is for the
Lower age group = 65.4923 ± 2.7939 mm.
Higher age group = 71.0720 ± 4.1726 mm.
Difference = 5.5797 ± 5.02 mm.


The variability of the younger group is thus considerably 
less, but the difference is scarcely significant. Even though we 
cannot definitely assert that the variability is being reduced with 
time, the above noticed decrease is certainly interesting as giving 
an indication that such a view is not altogether untenable. 

If we turn to the Relative Variability, i.e. the Coefficient of 
Variation, we find 

Higher age group = 4.2073 ± .2470
Lower age group = 3.9545 ± .1687
Difference = 0.2528 ± .2991


The difference is less significant than the previous one. But 
the reduction in even the relative variability is distinctly sugges- 
tive. 

Another point must be carefully noted. The variability of the 
Anglo-Indian sample is not significantly diminished by selection 
of age groups. Thus the high value of the variability (both 
absolute and relative) is not merely due to the mixing of the 
different age groups but represents a real degree of dispersion. 




SECTION IX. SUMMARY OF CONCLUSIONS.


Statistical. 


(1) For stature, with samples of the order of 200, a grouping
unit of 50 mm. is fairly satisfactory. For calculating frequency
constants the grouping unit should be less than 3.5√N
(for samples of size N).

(2) Sheppard's corrections lead to substantial improvement
in the frequency constants and should never be omitted. With
small samples finer corrections (e.g. Pairman and Pearson) are
useless.

(3) The Gauss-Laplacian normal curve is adequate for 50
mm. grouping. For proper graduation, i.e. for testing goodness
of fit, the grouping should be broader than 700/√N.

(4) The actual frequency curve belongs to Type IV of Pearson's
Skew family. There is small positive asymmetry with the
Mode greater than the Mean, and a slight tendency towards
lepto-kurtosis. The general nature of the distribution is similar to other
homogeneous distributions.

(5) There is no definite evidence of statistical heterogeneity.
The Anglo-Indian sample may be accepted as a statistically
homogeneous sample.
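
The two rules for the grouping unit in conclusions (1) and (3) can be compared for a given sample size. This is a sketch resting on an assumption: the damaged expressions are read as 3.5√N and 700/√N, both in mm., and the function names are hypothetical.

```python
import math

def max_unit_for_constants(n):
    """Upper limit of the grouping unit (mm.) for frequency constants,
    on the reading 3.5 * sqrt(N)."""
    return 3.5 * math.sqrt(n)

def min_unit_for_fit(n):
    """Lower limit of the grouping unit (mm.) for testing goodness of
    fit, on the reading 700 / sqrt(N)."""
    return 700.0 / math.sqrt(n)

n = 200
upper = max_unit_for_constants(n)
lower = min_unit_for_fit(n)

print(round(upper, 1), round(lower, 1))
```

On this reading the two limits coincide when N = 200, at a little under 50 mm., which would agree with the 50 mm. unit adopted for the present sample.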


Anthropological (Stature). 

(1) The more highly civilised races have greater variabilities 
than the average. 

(2) This greater variability of more highly civilised races seems
to be only moderate in degree and is never excessive.

(3) Interracially, taller races seem to be more variable than
the shorter (both as regards the absolute and the relative
variability).

(4) Indian Castes and Tribes are significantly less variable
than the average.

(5) Anglo-Indian variability is greater than Indian Caste
variability but is of the same order as the variability of modern
European races.

(6) The variability of the Anglo-Indian sample though greater
than the average is not beyond the range of possibility of
homogeneous variability.

(7) The Anglo-Indians seem to be rather precocious in 
growth, and there is some indication of the arrest of growth 
occurring at an earlier age than in the case of European races. 

(8) Variability of the smaller age-groups is distinctly less, 
showing a decrease of variability with time (or increasing homo- 
geneity of the younger generation). 


APPENDIX I. NOTE ON STATISTICAL TERMS.


In this appendix I have made an attempt to explain, in non- 
mathematical language, some of the more frequently occurring 
technical terms of statistical theory. Considerations of space have 
prevented me from giving concrete illustrations. I hope however 
that the following pages will serve some useful purpose in helping 
anthropologists who lack the requisite mathematical training, in 
taking an intelligent interest in the various technical discussions 
contained in this paper. I have only attempted to give a general 
idea of the different terms; the statistician will, I hope, forgive
me for the consequent lack of precision in many places.

Let us consider our 200 measurements of Anglo-Indian stature.
Almost all individual measurements are different from one another.
The existence of variability is patent. The important fact is,
however, that this variability of stature is not chaotic in its
distribution, but that it is governed by definite laws.¹

We can classify our material into different groups in accordance
with size. We find, for example, that there are 2 individuals
whose heights are less than 1465 mm. Between 1465 and 1485,
there is only one. Between 1485 and 1505, there are 4, and so
on. Thus with a 20 mm. unit of grouping, we get the following
distribution of frequency in each group. (The number of individuals
in any group is called the frequency of that group).


Frequency Groups in units of 20 mm.
(Frequencies that cannot be read in this copy are shown as "…".)


Group  | 1445 | 1465 | 1485 | 1505 | 1525 | 1545 | 1565 | 1585 | 1605 | 1625
(mm.)  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to
       | 1465 | 1485 | 1505 | 1525 | 1545 | 1565 | 1585 | 1605 | 1625 | 1645
Number |  2   |  1   |  4   |  2   |  4   |  10  |  12  |  …   |  …   |  …

Group  | 1645 | 1665 | 1685 | 1705 | 1725 | 1745 | 1765 | 1785 | 1805 | 1825 | 1845
(mm.)  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to  |  to
       | 1665 | 1685 | 1705 | 1725 | 1745 | 1765 | 1785 | 1805 | 1825 | 1845 | 1865
Number |  21  |  …   |  …   |  …   |  …   |  …   |  …   |  …   |  …   |  …   |  …



These frequency groups are shown graphically in Plate I. 

Let the horizontal x-axis represent stature. Then, at the
middle point of each group, we can erect vertical lines proportional
to the frequency in that group. For example, at 1455,
which is the middle point of the group 1445-1465, we erect
a vertical line whose length is two units, to represent the
frequency in that group. At 1475, the height of the vertical
line is one unit and so on. If the extremities of these vertical
lines are joined by straight lines, we get the corresponding
frequency polygon. With 20 mm. unit of grouping, the polygon
is broken and irregular in outline, because many intermediate
measurements are missing in the sample.


¹ Cf. Goring: “The English Convict,” p. 20.

If we gradually increase the size of our sample, more and
more of these gaps will be filled up and the polygon will become
more and more regular. On the other hand, with an indefinitely
large sample, we can make the size of each group as small as we
please, without incurring any risk of meeting with gaps in the
measurements. Thus, with a very large sample, and when the size
of each group is indefinitely diminished, the discontinuous broken
polygon will gradually pass into a continuous smooth curve. This
frequency curve will give us the distribution of stature of an
indefinitely large population.

Such distributions are usually termed Chance distributions.
But as Pearson observes,¹ “in the first place, we have to recognise
that our conception of chance is now utterly different from that
of yore. Where we cannot predict, where we do not find order
and regularity, there we should now assert that something else
than chance is at work. What we are to understand by a chance
distribution is one in accordance with law and order, and one the
nature of which can for all practical purposes be closely
predicted. .... It is not theory, but actual statistical experience,
which forces us to the conclusion that, however little we know of
what will happen in the individual instance, yet the frequency of
a large number of instances is distributed round the mode in a
manner more and more smooth and uniform the greater the
number of instances. .... Our conception of chance is one of law
and order in large numbers; it is not that idea of chaotic incidence
that vexed the mediaeval mind.”

The Gaussian distribution (named after the great mathematician
Gauss) is one important standard type. It has got the
following characteristics:

(a) The frequency is maximum for the average value of the
organ measured.

(b) The distribution is symmetrical with regard to this
maximum.

(c) The curve slopes down, gradually and in a characteristic
way, to zero, so that extreme degrees of variation become
increasingly rare.

(d) The curve ends tangentially to the x-axis, so that infinitely
large degrees of variation are theoretically possible.

¹ Pearson: Chances of Death, Vol. I, p. 11.

Variability. We have not yet investigated the question of
variability of the distribution. Two frequency distributions may be
both Gaussian and yet their variabilities may differ widely.
Anthropologists have often used the range, which is defined as the
difference in size of the most extreme members, as a measure of
variability. A little reflection will, however, show that the range
is not at all suitable for this purpose. The inclusion in the sample
of a single abnormal “dwarf” or “giant” will completely upset
the value of the range. A measure so radically affected by stray
items at the extremes is practically useless for scientific purposes.¹

In current statistical practice it is usual to measure variability
by the Standard Deviation. The deviation of each measurement
from the Mean (or Average) is squared. The sum of all such
squares divided by their total number gives the second moment μ₂,
which is thus the average squared deviation of all the measurements.
The square root of μ₂, finally, gives the Standard Deviation. It
is the root-mean-square deviation of all the measurements,
and is a precise mathematical measure of the variability of the
sample. One great advantage in using the Standard Deviation is
that, together with the Mean, it uniquely defines the corresponding
Gaussian curve, so that the Gaussian can be found as soon as the
Standard Deviation is determined. Standard Deviation (or S.D.) is
usually represented by σ.
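
The computation just described may be sketched in a few lines of Python. This is an illustration only: the function names and the five statures are hypothetical and are not taken from the measurements analysed in this paper. The last function gives the ordinate of the Gaussian curve fixed by a given Mean and Standard Deviation.

```python
import math

def second_moment(values):
    """Average squared deviation from the Mean (the second moment)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def standard_deviation(values):
    """Square root of the second moment."""
    return math.sqrt(second_moment(values))

def gaussian_ordinate(x, mean, sigma):
    """Ordinate at x of the Gaussian curve with the given Mean and S.D."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical statures in mm, for illustration only.
statures = [1600, 1620, 1640, 1660, 1680]
mean = sum(statures) / len(statures)
sigma = standard_deviation(statures)
print(mean, second_moment(statures), sigma)
```

Note that the divisor here is the total number of measurements, as in the definition above, and not that number less one.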

Probable Errors.—The Gaussian distribution is also known
as the “normal curve of errors,” since it is assumed that this
curve gives the distribution of “errors” made in physical
measurements.² The greater the diversity in any set of measurements,
the greater will be the Standard Deviation of the set.
Accuracy or reliability depends on the uniformity of the set of
measurements, that is, on the smallness of the Standard Deviation.
The “probable error,” which measures the accuracy or reliability
of any set of measurements, is hence suitably defined by a particular
sub-multiple of the Standard Deviation.

If σ is adopted as the unit of measurement (that is, all
measurements in terms of ordinary units are divided by σ), then the
curve of errors becomes the standard curve of probability. The
mathematical theory of probability then enables us to find the
probability of any given deviation from the Mean occurring in the
sample.

For example, a deviation at least half as great as the Standard
Deviation will occur about 62 times in 100 instances. A deviation as
great as the Standard Deviation will occur in 32% of instances, while a
deviation four times as great will not happen more than once in
17,000 instances. The Probable Error is defined to be such a
deviation as will be exceeded by half the total deviations, or in
other words, the chances are even that any deviation will be greater
than or less than the Probable Error.
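
These probabilities can be checked from the Gaussian error function. The sketch below uses Python's `math.erfc`; the function names are illustrative. It gives about 0.617 and 0.317 for deviations of half and of once the Standard Deviation. For four times the Standard Deviation it gives roughly one instance in 15,800, the same order as the rounded figure quoted above. The Probable Error comes out at about 0.6745 times the Standard Deviation.

```python
import math

def prob_exceeding(k):
    """Chance that a Gaussian deviation exceeds k Standard Deviations,
    in either direction."""
    return math.erfc(k / math.sqrt(2.0))

def probable_error_factor(tol=1e-12):
    """Multiple of the S.D. exceeded by exactly half of all deviations,
    found by bisection on prob_exceeding."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prob_exceeding(mid) > 0.5:
            lo = mid   # too many deviations exceed mid: move upwards
        else:
            hi = mid
    return (lo + hi) / 2.0

for k in (0.5, 1.0, 4.0):
    print(k, prob_exceeding(k))
print(probable_error_factor())
```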

We must now come back to Anthropology. It is well known
that almost all anthropometric measurements have an approximately
Gaussian distribution. This was originally pointed out by
Quetelet, and since then has been confirmed by many different


1 For a simple non-technical account of the different measures of dispersion,
see King: “Elements of Statistical Theory” (MacMillan, 1919), p. 141.

2 This assumption itself is not always strictly true. See Pearson’s memoir on
“Errors of Judgement, etc.,” Phil. Trans. Roy. Soc. 198A (1902).
observers.¹ But it must be remembered that the distribution is
only approximately normal and is almost never exactly so. We are 
thus obliged to study other types of frequency distribution. 

It is often found that the maximum frequency does not occur 
at the Mean value of the character concerned. In such cases, the 
most frequent size, that is, the position of the maximum ordinate, 
is called the Mode. In anthropometric measurements it is very 
usual to find the Mode different from the Mean. When this happens,
the distribution is no longer symmetrical about the Mean.
Such asymmetrical distributions are called skew distributions.²

The distance between the Mode and the Mean is one obvious 
measure of skewness, or better still (for purposes of comparison), 
this distance divided by the Standard Deviation. The mathematical
measure of skewness depends on the third moment μ₃, obtained
by cubing the deviations from the Mean and taking the average.
The positive and negative deviations (from the Mean) must, by the 
very definition of the Mean, balance exactly; so that the sum of 
all deviations is zero. For a symmetrical curve this is also true 
of the cubes of deviations. But in the case of an asymmetrical 
curve, the sum of all the cubes of deviations is not zero. Hence 
the third moment, which is merely the average-sum of the cubes 
of all deviations, is not equal to zero. Thus μ₃, or more conveniently
β₁ = μ₃²/μ₂³, is a precise measure of the degree of asymmetry.
If β₁ is significantly different from zero, then the curve must be
considered skew.
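
As a minimal sketch of this measure, the following Python computes the moments about the Mean and then β₁ = μ₃²/μ₂³. The function names and both samples are hypothetical, chosen only so that one is symmetrical and the other has a single large positive deviation.

```python
def central_moment(values, order):
    """Average of the deviations from the Mean raised to the given power."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n

def beta1(values):
    """Skewness measure: square of the third moment divided by
    the cube of the second moment."""
    mu2 = central_moment(values, 2)
    mu3 = central_moment(values, 3)
    return mu3 ** 2 / mu2 ** 3

# Hypothetical statures in mm, for illustration only.
symmetrical = [1600, 1620, 1640, 1660, 1680]
skew = [1600, 1610, 1620, 1630, 1690]

print(central_moment(symmetrical, 3), beta1(symmetrical))
print(central_moment(skew, 3), beta1(skew))
```

Whether a value of β₁ obtained from a real sample differs significantly from zero must still be judged against its probable error, which this sketch does not compute.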

Frequency distributions may differ from the normal curve in 
another particular. The change of slope of the normal curve is 
a characteristic feature of the curve. Now a frequency curve may 
differ from the normal as regards the manner in which its slope 
changes. For example, if a curve rises more abruptly than the
normal curve, it is called a lepto-kurtic curve, while if it is
more flat-topped than the normal, it is called a platy-kurtic curve.
Curves with the same degree of abruptness as the normal are
known as meso-kurtic curves. The kurtosis is measured by β₂ − 3,
where β₂ = μ₄/μ₂² and μ₄ is the fourth moment.
For meso-kurtic curves β₂ is equal to 3, and the kurtosis is zero.
For lepto-kurtic curves β₂ is greater than 3, and for platy-kurtic
curves it is less than 3. A frequency curve may also differ from the
normal in having a definitely limited range. The curve may be limited in
one or in both directions. With these curves there is a definite
theoretical limit to the size of deviations.
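
The classification by kurtosis may be illustrated as follows, taking β₂ as the fourth moment divided by the square of the second moment (Pearson's definition). The function names and the two samples are hypothetical. The sketch compares β₂ with 3 directly, whereas with real data the difference would have to be judged against its probable error.

```python
def central_moment(values, order):
    """Average of the deviations from the Mean raised to the given power."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n

def beta2(values):
    """Fourth moment divided by the square of the second moment."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

def kurtosis(values):
    """Excess of beta2 over its Gaussian value of 3."""
    return beta2(values) - 3.0

def classify(values):
    k = kurtosis(values)
    if k > 0:
        return "lepto-kurtic"
    if k < 0:
        return "platy-kurtic"
    return "meso-kurtic"

# Hypothetical statures in mm, for illustration only.
flat_topped = [1600, 1620, 1640, 1660, 1680]
peaked = [1640] * 8 + [1600, 1680]

print(beta2(flat_topped), classify(flat_topped))
print(beta2(peaked), classify(peaked))
```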

The Coefficient of Variation.—Pearson³ says, “In dealing with
the comparative variation of men and women...., we have constantly
to bear in mind that relative size influences not only the
means but the deviations from the means. When dealing
with absolute measurements, it is, of course, idle to compare the


1 For references see pp. 42-44.
2 For literature on the subject see references quoted on p. 16. Also J. C.
Kapteyn: “Skew Frequency Curves in Biology and Statistics.”
3 Karl Pearson: “Regression, Heredity and Panmixia,” Phil. Trans.
Roy. Soc., Vol. 187A, 1896, pp. 276-277.
variation of the larger male organ directly with the variation of
the smaller female organ. The same remark also applies to the 
comparison of large and small built races .... We may take 
as a measure of variation the ratio of Standard Deviation to mean, 
or what is more convenient, this quantity multiplied by 100. We 
shall, accordingly, define V, the coefficient of variation, as the 
percentage variation in the mean, the Standard Deviation being 
treated as the total variation in the mean .... Of course, it does
not follow because we have defined in this manner our ‘coefficient
of variation,’ that this coefficient is really significant in the
comparison of various races; it may be only a convenient mathematical
expression, but I believe there is evidence to show that it
is a more reliable test of ‘efficiency’ in a race than absolute
variation .... By ‘race efficiency,’ I would denote stability,
combined with capacity to play a part in the history of civilisation.”
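
Pearson's definition, V = 100 × Standard Deviation / Mean, can be sketched as below. The function name and both samples are hypothetical. The second sample is simply the first at half the size, so that the two have different Standard Deviations but the same coefficient of variation.

```python
import math

def coefficient_of_variation(values):
    """V: the Standard Deviation expressed as a percentage of the Mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return 100.0 * sd / mean

# Hypothetical measurements in mm, for illustration only.
larger = [1600, 1620, 1640, 1660, 1680]
smaller = [800, 810, 820, 830, 840]

print(coefficient_of_variation(larger))
print(coefficient_of_variation(smaller))
```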




APPENDIX II.
TABLE OF MEASUREMENTS.


Stature of Anglo-Indians measured in the Anthropological Laboratory of
the Indian Museum, Calcutta.


Card  Age in Stature | Card  Age in Stature | Card  Age in Stature
No.   years. in mm.  | No.   years. in mm.  | No.   years. in mm.
87    15     1446    | 14?   20     1673    | 64    23     1472
166   26     1624    | 168   20     1664    | 61    23     1572
186   16     1588    | 241   20     1638    | 269   23     1624
147   16     1666    | 156   20     1622    | 42    2?     1646
144   16     1726    | 280   20     1622    | 224   24     1592
?     ?      1656    | 248   20     1562    | 45    24     1610
175   17     1588    | 65    20     1500    | 54    24     1620
145   17     1388    | 275   20     1510    | 207   24     1690
                     | 217   20     1514    | 50    24     1670
?     ?      ?       | 152   20     1610    | ?     24     1684
?     ?      ?       | 7     ?      ?       | 13    24     1696
                     | 219   20     1620    | 230   24     1634
?     ?      ?       | 172   20     1658    | 284   24     1596
2     18     1746    | 107   21     1626    | ?     ?      1030
141   18     1768    | 102   21     1646    | ?     ?      ?
132   18     1610    | 111   21     1650    | 37    ?      1738
258   18     1602    | 287   21     1654    | 202   2?     1513
86    18     1636    | 234   21     1656    | 54    2?     1580
191   18     1660    | 99    21     1708    | 285   25     1619
160   18     1570    | 133   21     1730    | 48    25     ?
251   18     1574    | 101   21     1768    | 3     ?      1034
94    18     1580    | ?     21     1768    | 293   26     1644
288   19     1638    | 106   21     1704    | 240   26     1638
277   19     1636    | 10    21     1604    | 282   26     1656
294   19     1634    | 281   21     1696    | 263   26     1730
66    19     1630    | 28    ?      1672    | 2     26     1710
286   19     1606    | ?     21     1678    | 60    26     1604
298   19     1614    | 267   21     1624    | 58    26     1628
53    19     1604    | 88    21     ?       | ?     ?      ?
75    19     1586    | 9     22     1730    | 63    27     1522
288   19     1458    | 148   22     ?       | ?     27     ?
226   19     1768    | 74    22     1716    | 119   2?     1692
4     19     1768    | 180   22     1700    | 38    27     ?
151   19     1760    | 149   22     1700    | 39    27     1776
295   ?      1744    | 108   ?      16??    | 29    24     ?
174   19     1718    | 103   22     1688    | 137   27     1840
?29   ?      1705    | 170   22     1677    | ?     27     1656
6     19     1706    | 72    22     1650    | 1     27     1650
56    19     ?780    | 96    22     1568    | 31    28     1610
146   19     1674    | ?     ?      1256    | 232   28     1636
176   19     1686    | 177   22     1608    | 223   28     1754
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